Compute zones — bRRAIn Docs
How bRRAIn partitions and isolates compute for safety, performance, and security.
Compute zones
A compute zone is a partition of execution within your brain pod. Zones isolate workloads from each other so a runaway extension can't starve the rest of the system, an experimental adapter can't see data outside its boundary, and an audit can prove what ran where.
Why zones
Three reasons:
- Isolation — extensions get their own slice of CPU, RAM, and (where present) GPU. They can't escape into another extension's slice.
- Safety — the bRRAIn Vault writer and the audit logger run in their own zone, so they keep functioning when an extension misbehaves.
- Auditability — every action is tagged with the zone it ran in, simplifying after-the-fact reasoning.
Zone types
A pod runs four classes of zones:
System zone
Where the bRRAIn Consolidator, Handler, Vault writer, audit logger, and supervisor live. Highest reliability. Reserved capacity. Cannot be paused, evicted, or starved by other zones.
You don't manage the system zone directly. The platform provisions it and keeps it healthy for you.
Extension zones
One per installed marketplace extension. Each has:
- A configured CPU budget (vCPUs).
- A configured memory budget (RAM).
- An optional GPU budget (whole or fractional GPU, when GPUs are present).
- A network egress allowance.
- A storage allocation under the pod's network volume.
The supervisor enforces these budgets. A zone that exceeds its CPU budget is throttled; a zone that exceeds its memory budget is out-of-memory-killed and then restarted. The supervisor logs the event in either case.
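As a rough illustration of the enforcement rule above, the sketch below models a budget and the action taken on an overrun. The `ZoneBudget` type, its field names, and the action strings are hypothetical, chosen for this example; they are not the supervisor's actual interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ZoneBudget:
    """Hypothetical shape of an extension zone's budget (names are assumptions)."""
    vcpus: float
    memory_mib: int
    gpu_fraction: Optional[float] = None  # None when no GPU is allocated


def enforcement_action(budget: ZoneBudget, cpu_used: float, memory_used_mib: int) -> str:
    """Return the action a supervisor-like enforcer would take for one sample."""
    # A memory overrun ends in an out-of-memory kill followed by a zone restart.
    if memory_used_mib > budget.memory_mib:
        return "oom_kill_and_restart"
    # A CPU overrun is throttled; the zone keeps running.
    if cpu_used > budget.vcpus:
        return "throttle"
    return "ok"


budget = ZoneBudget(vcpus=2.0, memory_mib=1024)
print(enforcement_action(budget, cpu_used=3.0, memory_used_mib=512))  # throttle
```

Checking memory first is a choice made for this sketch: a zone that is over both budgets is killed rather than throttled.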
Workload zones
Ad-hoc zones the supervisor spins up for specific jobs — a one-off training run, an evaluation suite, a batch document import. Workload zones live for the lifetime of the job and are reaped on completion.
Sandbox zone
A locked-down zone where untrusted code runs (Code Sandbox feature). No network egress except through approved channels, no Vault write, ephemeral storage. Used when you need to execute something whose provenance you don't fully trust.
How zone membership is decided
When an extension is installed, its manifest declares its preferred zone class and its requested resources. The supervisor decides:
- Class — usually extension class. Sometimes sandbox (for experimental extensions) or workload (for short-lived ones).
- Resources — clamped to your plan limits and to what your pod can actually offer.
- Co-tenancy — extensions can be configured to never share a zone with another specified extension (for compliance separations).
Zone settings are visible per-extension on the Console → Installed extensions → [Extension] → Settings page.
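The resource and co-tenancy decisions can be pictured with a small sketch. It assumes that clamping means taking the smallest of the requested amount, the plan limit, and what the pod has available, and that separations are stored as unordered pairs of extension slugs. The `Resources` type and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource request; only CPU and memory are modelled here."""
    vcpus: float
    memory_mib: int


def clamp_request(requested: Resources, plan_limit: Resources, pod_available: Resources) -> Resources:
    """Grant no more than the request, the plan limit, or the pod's free capacity."""
    return Resources(
        vcpus=min(requested.vcpus, plan_limit.vcpus, pod_available.vcpus),
        memory_mib=min(requested.memory_mib, plan_limit.memory_mib, pod_available.memory_mib),
    )


def may_share_zone(a: str, b: str, separations: Set[FrozenSet[str]]) -> bool:
    """Two extensions may share a zone unless a separation rule names both."""
    return frozenset((a, b)) not in separations


granted = clamp_request(Resources(8.0, 16384), Resources(4.0, 8192), Resources(6.0, 4096))
print(granted)  # Resources(vcpus=4.0, memory_mib=4096)
```

In this example the plan limit caps the CPU grant and the pod's free memory caps the memory grant, so each dimension is clamped independently.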
Resource scheduling
Within a zone, workloads share the zone's allocation cooperatively. CPU is allocated by the OS scheduler; memory comes out of the zone's pool; GPU time is divided by the GPU runtime where applicable.
Across zones, the supervisor enforces fairness and isolation. The system zone is never preempted. Extension zones can be preempted by the supervisor in narrow cases:
- The system zone needs unexpected headroom (rare; the system zone has reserved capacity).
- A higher-priority workload zone needs the resources (only when you've configured priorities).
- The pod is being upgraded and is draining workloads.
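The three preemption cases above can be read as a small decision function. This is a sketch under stated assumptions: the `PodState` fields, the reason strings, the order in which the cases are checked, and the convention that a larger number means higher priority are all hypothetical. The system zone does not appear because it is never preempted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PodState:
    """Hypothetical snapshot of the conditions that can trigger preemption."""
    system_needs_headroom: bool = False
    draining_for_upgrade: bool = False
    priorities_configured: bool = False
    waiting_workload_priority: Optional[int] = None  # None when no workload zone is waiting


def extension_preemption_reason(zone_priority: int, pod: PodState) -> Optional[str]:
    """Return why an extension zone may be preempted, or None if it may not."""
    if pod.draining_for_upgrade:
        return "pod_upgrade_drain"
    if pod.system_needs_headroom:
        return "system_headroom"
    # Priority-based preemption applies only when priorities have been configured.
    if (
        pod.priorities_configured
        and pod.waiting_workload_priority is not None
        and pod.waiting_workload_priority > zone_priority
    ):
        return "higher_priority_workload"
    return None


print(extension_preemption_reason(1, PodState()))  # None
```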
Networking
Each zone has an isolated network namespace. Traffic that crosses a zone boundary goes through one of three channels:
- The platform SDK loopback, for extension-to-platform calls.
- The supervisor's internal API for system services.
- The MCP Gateway, for outbound calls to integrations.
There's no direct extension-to-extension network path. If two extensions need to coordinate, they coordinate through the platform.
Storage
Each zone has its own slice of the pod's network volume:
- Extension zones get a per-extension directory at /opt/brrain/extensions/<slug>/.
- Workload zones get an ephemeral scratch directory.
- The system zone owns the rest of the volume.
Vault data is owned by the system zone; extensions read and write the Vault through the platform SDK, never directly.
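To make the per-extension directory layout concrete, here is a minimal sketch that builds an extension's directory from its slug and checks whether a path falls inside it. The `/opt/brrain/extensions/<slug>/` layout comes from the list above; the function names and the slug validation rules are assumptions for illustration, not the platform's actual checks.

```python
from pathlib import PurePosixPath

# Root taken from the layout described above.
EXTENSIONS_ROOT = PurePosixPath("/opt/brrain/extensions")


def extension_dir(slug: str) -> PurePosixPath:
    """Return the directory for one extension, rejecting slugs that could escape the root."""
    if not slug or "/" in slug or slug in (".", ".."):
        raise ValueError(f"invalid extension slug: {slug!r}")
    return EXTENSIONS_ROOT / slug


def zone_owns(slug: str, path: str) -> bool:
    """True if `path` is the extension's directory or lies somewhere beneath it."""
    candidate = PurePosixPath(path)
    # Pure paths do not resolve "..", so refuse any path that contains it.
    if ".." in candidate.parts:
        return False
    root = extension_dir(slug)
    return candidate == root or root in candidate.parents


print(zone_owns("notes", "/opt/brrain/extensions/notes/cache/a.bin"))  # True
print(zone_owns("notes", "/opt/brrain/extensions/other/a.bin"))        # False
```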
Observability
Per-zone telemetry surfaces in Console → Observability:
- CPU utilization.
- Memory in use vs allocated.
- GPU utilization where present.
- Network egress.
- Storage used.
Per-zone alerts can be configured (e.g., "alert if extension X memory exceeds 80% for more than 10 minutes").
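An alert such as the example above combines a threshold with a duration. The sketch below shows one way to evaluate that kind of rule over time-ordered samples; the `SustainedThresholdRule` type and the `(timestamp, utilization)` sample format are assumptions for illustration, not the Console's alert schema.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class SustainedThresholdRule:
    """Hypothetical rule: fire when a metric stays above `threshold` for longer than `window_seconds`."""
    threshold: float      # fraction of the allocation, e.g. 0.80
    window_seconds: int   # e.g. 600 for ten minutes


def should_alert(rule: SustainedThresholdRule, samples: Iterable[Tuple[int, float]]) -> bool:
    """`samples` are (unix_seconds, utilization_fraction) pairs in time order."""
    breach_start: Optional[int] = None
    for timestamp, value in samples:
        if value > rule.threshold:
            if breach_start is None:
                breach_start = timestamp  # first sample of a new breach
            if timestamp - breach_start > rule.window_seconds:
                return True
        else:
            breach_start = None  # any dip below the threshold resets the clock
    return False


rule = SustainedThresholdRule(threshold=0.80, window_seconds=600)
steady = [(t, 0.9) for t in range(0, 661, 60)]  # 11 minutes at 90%
print(should_alert(rule, steady))  # True
```

Resetting on any dip keeps short spikes from firing the alert, at the cost of missing a metric that hovers around the threshold.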
Failure handling
Zone-level failures are contained:
- An out-of-memory kill in an extension zone restarts that zone, not the pod.
- A crash loop triggers exponential backoff and an alert.
- A zone that fails to start gets one retry; if the retry also fails, the zone is marked failed.
The system zone is monitored separately; system-zone health drives the pod's overall health pip.
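The crash-loop backoff and the single start retry can be sketched as follows. The base delay, growth factor, and cap are illustrative values, and both function names are hypothetical; the platform's actual timings are not stated here.

```python
from typing import Callable, List


def backoff_delays(base_seconds: float = 1.0, factor: float = 2.0,
                   cap_seconds: float = 300.0, attempts: int = 5) -> List[float]:
    """Delays before each successive restart: exponential growth, capped."""
    return [min(cap_seconds, base_seconds * factor ** i) for i in range(attempts)]


def start_zone(start: Callable[[], bool]) -> str:
    """Try to start a zone once, retry once on failure, then give up."""
    for _ in range(2):  # the initial attempt plus one retry
        if start():
            return "running"
    return "failed"


print(backoff_delays(attempts=5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
print(start_zone(lambda: False))   # failed
```

The cap keeps a long crash loop from pushing the restart delay out indefinitely.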
Where to next
- Vaults — how the Vault is partitioned into data zones (different concept; same word).
- MCP Gateway — how outbound calls from any zone are governed.
- Console: Observability — per-zone telemetry.