Sandboxing agent-generated code with disposable unikernels
Coding agents run a lot of commands on your machine. npm test, python script.py, gh pr create, rm to clean up a stale build directory. All of those execute with the developer’s credentials. The agent’s reasoning failure becomes your rm -rf.
The standard responses to this are all bad in their own way:
- Just run it on the host. What most agents do today. Containers help, but a container is one kernel escape thick, and most agent setups don’t bother with one anyway.
- Use a cloud sandbox. E2B, OpenAI’s Code Interpreter. Remote, vendor-tied, persistent state across calls, billed by the second. Fine for hosted products. Wrong shape for a developer-machine coding agent.
- Use an OS sandbox. firejail, sandbox-exec, bwrap. Better than nothing. Not VM-grade, and the policy language for each is different.
- Roll your own VM tooling. Kernel selection, image building, lifecycle, policy, audit — months of work nobody wants to do.
What’s missing is a local, per-call, VM-isolated execution primitive that an agent can just wrap around every shell command. I built one and called it unitask.
What it does
You hand unitask a piece of code and a policy. It boots a fresh unikernel, runs the code inside, returns the result, and destroys the VM. Sub-second end-to-end on Apple Silicon, ~1–2 seconds on Linux without KVM. Every run leaves a trace on disk — code, policy, stdout, stderr, network log, exit code — but no live VM, no persistent state, no surface for the next call to inherit.
The runtime is Nanos, a unikernel that links your worker directly against a minimal library OS in a single address space. No shell. No userland. No package manager. The “OS” is exactly what the worker needs — Node, a libc, the syscalls Nanos implements — and nothing else.
Why unikernels, not microVMs
The obvious alternative is a microVM like Firecracker or Cloud Hypervisor. Firecracker is what AWS Lambda uses; it has serious production hardening; it’s the safe choice. I picked unikernels anyway, for three reasons:
Attack surface. A microVM still boots a Linux guest with a kernel, a userland, a shell, sometimes a service manager. That’s a lot of code the worker can poke at if it escapes the language runtime. A unikernel is the worker plus a library OS — nothing else in the image to compromise.
Boot time. Firecracker boots in 100–150ms on a good day; cold-cache it can take longer. A Nanos unikernel from a pre-built image boots in roughly 100ms once QEMU is up, and the QEMU setup itself amortizes across runs in the same process. End-to-end I see ~500ms on M-series under HVF and ~1–2s on Linux under TCG software emulation. That’s slow enough to feel, fast enough to put inline behind every agent shell call.
The “OS” is what the worker needs. Packaging a per-language runtime as a unikernel is more work than putting it in a container image. In return, the resulting image has no general-purpose Linux underneath it. There’s no /bin/sh to shell out through. There’s nothing to LD_PRELOAD against. The threat model gets smaller because the surface gets smaller.
The tradeoffs are real. The unikernel ecosystem is much smaller than Linux. Some syscalls aren’t implemented and surface as ENOENT in unexpected places. Packaging a new language runtime is custom work. These are accepted costs for the threat-model gains, not incidental ones.
The policy model
The “code in, runs, returns” half is straightforward. The half that matters in practice is the policy:
- Network is default-deny. QEMU boots with -nic none. The worker has no network unless someone opens it.
- Per-host allowlist for HTTP/HTTPS. --allow-net api.openai.com opens that specific host (and only that host) through an HTTP CONNECT proxy that validates the host header on every request.
- Per-(host:port) allowlist for raw TCP. --allow-tcp 127.0.0.1:5432 exposes Postgres on the loopback, nothing else.
- Read-only host mounts. --file ./data.csv and --dir ./src inject specific paths read-only at known mountpoints. The image is destroyed at run end so the host originals are untouched.
- Secret injection that’s not logged. --secret OPENAI_API_KEY resolves the value from host env, bakes it into the image’s env, and never writes it to the run record. Captured stdout and stderr are scrubbed of the value post-run.
- Wall-clock timeout, memory cap, hard kill. No runaway processes.
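To make the allowlist and scrubbing rules above concrete, here is a minimal Python sketch of the three checks. It is an illustration of the idea, not unitask's code: the function names and the redaction marker are hypothetical, and exact-match comparison is my assumption about how the allowlists behave.

```python
from urllib.parse import urlsplit


def host_allowed(url: str, allow_net: list[str]) -> bool:
    """True only if the URL's host exactly matches an allowlisted host.

    Exact matching (no suffix or wildcard matching) is an assumption.
    """
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if absent
    return host is not None and host in {h.lower() for h in allow_net}


def tcp_allowed(host: str, port: int, allow_tcp: list[str]) -> bool:
    """True only if the exact host:port pair is on the raw-TCP allowlist."""
    return f"{host}:{port}" in allow_tcp


def scrub(text: str, secrets: dict[str, str]) -> str:
    """Replace every occurrence of each secret value in captured output.

    The "[NAME redacted]" marker is hypothetical, chosen for illustration.
    """
    for name, value in secrets.items():
        if value:  # skip empty values: replacing "" would corrupt the text
            text = text.replace(value, f"[{name} redacted]")
    return text
```

Exact matching is the conservative choice here: a lookalike such as api.openai.com.evil.example does not pass a check for api.openai.com.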
That’s the per-call layer. There’s also a per-project layer.
The ceiling pattern
The caller in a typical agent setup isn’t a human — it’s the agent. You don’t want the agent declaring its own permissions. So unitask reads a .unitask.toml from the project root (walking up like git or tsc) and treats it as a ceiling on every call:
memoryMb = 512
timeoutSeconds = 60
allowNet = ["api.github.com", "api.openai.com"]
allowTcp = ["127.0.0.1:5432"]
secrets = ["GITHUB_TOKEN", "OPENAI_API_KEY"]
filesUnder = ["/Users/me/work"]
dirsUnder = ["/Users/me/work"]
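The "walking up" lookup for a file like the one above can be sketched in a few lines of Python. This is an assumption about the behavior (nearest file wins, search stops at the filesystem root), and find_ceiling is a hypothetical name, not a unitask API.

```python
from pathlib import Path
from typing import Optional


def find_ceiling(start: Path, name: str = ".unitask.toml") -> Optional[Path]:
    """Return the nearest config file at or above `start`, or None.

    Checks `start` first, then each parent directory in turn, the way
    git looks for a .git directory.
    """
    current = start.resolve()
    for directory in (current, *current.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

Calling find_ceiling(Path.cwd()) from a nested source directory would return the project-root file if one exists anywhere up the tree.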
The effective policy on each run is the intersection of what the caller requested and what the ceiling allows. Scalars get clamped; lists get intersected. Anything dropped lands in the run record’s policyCeiling.denials for the audit trail.
The host declares the maximum. The caller narrows from there. Agents can never escalate beyond the ceiling, and you can see in the trace exactly what they tried.
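The clamp-and-intersect rule is small enough to sketch in Python. The key names come from the .unitask.toml example; the function name, the dict representation, and the wording of the denial entries are hypothetical. filesUnder and dirsUnder are left out because they constrain paths by prefix rather than by list membership.

```python
SCALARS = ("memoryMb", "timeoutSeconds")
LISTS = ("allowNet", "allowTcp", "secrets")


def apply_ceiling(requested: dict, ceiling: dict) -> tuple[dict, list[str]]:
    """Return (effective policy, denials) for one call.

    Scalars are clamped to the ceiling; list entries survive only if the
    ceiling also contains them. Everything dropped is recorded.
    """
    effective: dict = {}
    denials: list[str] = []

    for key in SCALARS:
        want, cap = requested[key], ceiling[key]
        if want > cap:
            denials.append(f"{key}: requested {want}, clamped to {cap}")
        effective[key] = min(want, cap)

    for key in LISTS:
        allowed = set(ceiling.get(key, []))
        effective[key] = []
        for item in requested.get(key, []):
            if item in allowed:
                effective[key].append(item)
            else:
                denials.append(f"{key}: {item} not in ceiling")

    return effective, denials
```

With the ceiling shown earlier, a call asking for 2048 MB and network access to an unlisted host would run with 512 MB and no access to that host, and both adjustments would show up in the denials list.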
MCP as the integration surface
unitask runs as both a CLI and an MCP server. The MCP server is the more interesting half because it makes the tool drop-in for any agent that speaks the protocol — Claude Code, Cursor, VS Code Copilot, any custom backend using the MCP SDK.
Wiring it into a coding agent is one JSON snippet:
{
  "mcpServers": {
    "unitask": { "command": "unitask", "args": ["mcp"] }
  }
}
After a restart, the agent has three new tools: run_code (run a piece of code under policy), inspect_run (read a past run’s trace), doctor (check whether code execution is even possible). The agent doesn’t know it’s talking to QEMU. It’s calling a tool that happens to be sandboxed.
For app backends — chatbots with code-interpreter modes, eval pipelines, workflow builders that execute user code — the integration is the same shape but via the MCP SDK directly. Spawn unitask mcp as a subprocess, talk MCP over stdio, get a sandboxed run_code tool. One subprocess per request: fine for prototyping and low-to-moderate volume, not the right shape for a high-throughput shared service (a remote HTTP/SSE transport isn’t built yet).
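For a sense of what "talk MCP over stdio" means on the wire, here is a stdlib-only Python sketch of the JSON-RPC messages a client would write to the subprocess's stdin. The message shapes follow the MCP specification (initialize, then the initialized notification, then tools/call, one JSON object per line). The argument schema for run_code is an assumption: I use a single "code" field for illustration, and the client name and protocol version string are placeholders. In a real backend the MCP SDK handles all of this for you.

```python
import json


def mcp_messages(code: str) -> list[dict]:
    """Build the client-side message sequence for one run_code call."""
    return [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": "2024-11-05",  # placeholder version
                "capabilities": {},
                "clientInfo": {"name": "example-backend", "version": "0.1.0"},
            },
        },
        # Sent after the server has answered the initialize request.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {
            "jsonrpc": "2.0",
            "id": 2,
            "method": "tools/call",
            # "arguments" shape is hypothetical; check the tool's schema.
            "params": {"name": "run_code", "arguments": {"code": code}},
        },
    ]


def frame(messages: list[dict]) -> str:
    """Serialize for the stdio transport: one JSON object per line."""
    return "".join(json.dumps(message) + "\n" for message in messages)
```

A real client reads the server's response to initialize before sending the rest, and should call tools/list first to learn the actual input schema of run_code rather than relying on the field name assumed here.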
Where this fits
The agent stack is going to grow a security layer the same way the web stack grew TLS — slowly at first, then everywhere, once enough breaches make the case. The shape of that layer isn’t obvious yet. unitask is one bet on what it could look like at the per-call code-execution slot: local, VM-isolated, ephemeral, declaratively constrained, traceable.
Code is at github.com/jnormore/unitask.