Sandboxing agent-generated code with disposable unikernels

Coding agents run a lot of commands on your machine. npm test, python script.py, gh pr create, rm to clean up a stale build directory. All of those execute with the developer’s credentials. The agent’s reasoning failure becomes your rm -rf.

The standard responses to this are all bad in their own way.

What’s missing is a local, per-call, VM-isolated execution primitive that an agent can just wrap around every shell command. I built one and called it unitask.

What it does

You hand unitask a piece of code and a policy. It boots a fresh unikernel, runs the code inside, returns the result, and destroys the VM. Sub-second end-to-end on Apple Silicon, ~1–2 seconds on Linux without KVM. Every run leaves a trace on disk — code, policy, stdout, stderr, network log, exit code — but no live VM, no persistent state, no surface for the next call to inherit.

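[Figure: the caller (agent / CLI) sends code + policy to unitask; unitask passes image + args to QEMU, which boots a per-run VM running the Nanos unikernel; the worker runs; stdout / exit come back as the result; the trace is written to ~/.unitask/runs/<id>/.]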
Per-call lifecycle: caller hands unitask code and policy; unitask builds and boots a fresh unikernel image; the worker runs to completion; the VM is destroyed; the trace persists.
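To make "every run leaves a trace on disk" concrete, here is a minimal sketch of what persisting one finished run could look like. The file names (code.txt, run.json and so on) and the JSON field names are assumptions for illustration; the post only says the trace holds code, policy, stdout, stderr, network log and exit code under ~/.unitask/runs/<id>/.

```python
import json
import tempfile
import uuid
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class RunTrace:
    """Everything one run leaves behind. Field names are illustrative."""
    code: str
    policy: dict
    stdout: str
    stderr: str
    exit_code: int
    network_log: list = field(default_factory=list)


def write_trace(runs_root: Path, trace: RunTrace) -> Path:
    """Persist a finished run under runs_root/<id>/ and return that directory."""
    run_dir = runs_root / uuid.uuid4().hex  # hypothetical id scheme
    run_dir.mkdir(parents=True)
    (run_dir / "code.txt").write_text(trace.code)
    (run_dir / "stdout.txt").write_text(trace.stdout)
    (run_dir / "stderr.txt").write_text(trace.stderr)
    (run_dir / "run.json").write_text(json.dumps({
        "policy": trace.policy,
        "exitCode": trace.exit_code,
        "networkLog": trace.network_log,
    }, indent=2))
    return run_dir


# Demo against a throwaway directory instead of the real ~/.unitask/runs.
demo_root = Path(tempfile.mkdtemp())
demo_dir = write_trace(
    demo_root,
    RunTrace("console.log(1)", {"memoryMb": 128}, "1\n", "", 0),
)
```

The point of the shape is that nothing in the directory is live: once the files are written, the VM that produced them no longer exists.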

The runtime is Nanos, a unikernel that links your worker directly against a minimal library OS in a single address space. No shell. No userland. No package manager. The “OS” is exactly what the worker needs — Node, a libc, the syscalls Nanos implements — and nothing else.

Why unikernels, not microVMs

The obvious alternative is a microVM like Firecracker or Cloud Hypervisor. Firecracker is what AWS Lambda uses; it has serious production hardening; it’s the safe choice. I picked unikernels anyway, for three reasons:

Attack surface. A microVM still boots a Linux guest with a kernel, a userland, a shell, sometimes a service manager. That’s a lot of code the worker can poke at if it escapes the language runtime. A unikernel is the worker plus a library OS — nothing else in the image to compromise.

Boot time. Firecracker boots in 100–150ms on a good day; cold-cache it can take longer. A Nanos unikernel from a pre-built image boots in roughly 100ms once QEMU is up, and the QEMU setup itself amortizes across runs in the same process. End-to-end I see ~500ms on M-series under HVF and ~1–2s on Linux under TCG software emulation. That’s slow enough to feel, fast enough to put inline behind every agent shell call.

The “OS” is what the worker needs. Packaging a per-language runtime as a unikernel is more work than putting it in a container image. In return, the resulting image has no general-purpose Linux underneath it. There’s no /bin/sh to shell out through. There’s nothing to LD_PRELOAD against. The threat model gets smaller because the surface gets smaller.

The tradeoffs are real. The unikernel ecosystem is much smaller than Linux. Some syscalls aren’t implemented and surface as ENOENT in unexpected places. Packaging a new language runtime is custom work. These are accepted costs for the threat-model gains, not incidental ones.

The policy model

The “code in, runs, returns” half is straightforward. The half that matters in practice is the policy.

That’s the per-call layer. There’s also a per-project layer.

The ceiling pattern

The caller in a typical agent setup isn’t a human — it’s the agent. You don’t want the agent declaring its own permissions. So unitask reads a .unitask.toml from the project root (walking up like git or tsc) and treats it as a ceiling on every call:

memoryMb       = 512
timeoutSeconds = 60
allowNet       = ["api.github.com", "api.openai.com"]
allowTcp       = ["127.0.0.1:5432"]
secrets        = ["GITHUB_TOKEN", "OPENAI_API_KEY"]
filesUnder     = ["/Users/me/work"]
dirsUnder      = ["/Users/me/work"]

The effective policy on each run is the intersection of what the caller requested and what the ceiling allows. Scalars get clamped; lists get intersected. Anything dropped lands in the run record’s policyCeiling.denials for the audit trail.

[Figure: caller request allow_net = [github, openai, slack] (what the agent asked for); .unitask.toml allow_net = [github, openai] (what the host permits); effective allow_net = [github, openai] (the intersection, applied to the run); denials: [slack], recorded in the run trace, not applied.]
Each call's effective policy is the intersection of the agent's request and the project's .unitask.toml ceiling. Anything dropped is preserved in the run record.
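The clamp-and-intersect rule is small enough to sketch directly. What follows is an illustration under assumptions, not unitask's implementation: the function name apply_ceiling and the shape of the denials record are invented here, a scalar missing from the ceiling is treated as uncapped, and filesUnder / dirsUnder are left out because they would need path-prefix checks rather than exact-match intersection. The host name slack.com stands in for the "slack" entry in the diagram.

```python
SCALAR_KEYS = ("memoryMb", "timeoutSeconds")
LIST_KEYS = ("allowNet", "allowTcp", "secrets")


def apply_ceiling(request: dict, ceiling: dict) -> tuple[dict, dict]:
    """Return (effective policy, denials) for one call."""
    effective: dict = {}
    denials: dict = {}

    # Scalars are clamped to the ceiling value.
    for key in SCALAR_KEYS:
        if key not in request:
            continue
        asked = request[key]
        cap = ceiling.get(key, asked)  # assumption: no entry means no cap
        effective[key] = min(asked, cap)
        if asked > cap:
            denials[key] = asked

    # Lists are intersected; request order is preserved.
    for key in LIST_KEYS:
        asked = request.get(key, [])
        allowed = set(ceiling.get(key, []))
        effective[key] = [item for item in asked if item in allowed]
        dropped = [item for item in asked if item not in allowed]
        if dropped:
            denials[key] = dropped

    return effective, denials


ceiling = {
    "memoryMb": 512,
    "timeoutSeconds": 60,
    "allowNet": ["api.github.com", "api.openai.com"],
    "secrets": ["GITHUB_TOKEN", "OPENAI_API_KEY"],
}
request = {
    "memoryMb": 1024,
    "timeoutSeconds": 30,
    "allowNet": ["api.github.com", "api.openai.com", "slack.com"],
}
effective, denials = apply_ceiling(request, ceiling)
```

Here the run gets 512 MB rather than the 1024 it asked for, keeps its 30-second timeout, and loses slack.com; both the oversized memory request and the dropped host end up in denials.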

The host declares the maximum. The caller narrows from there. Agents can never escalate beyond the ceiling, and you can see in the trace exactly what they tried.

MCP as the integration surface

unitask runs as both a CLI and an MCP server. The MCP server is the more interesting half because it makes the tool drop-in for any agent that speaks the protocol — Claude Code, Cursor, VS Code Copilot, any custom backend using the MCP SDK.

Wiring it into a coding agent is one JSON snippet:

{
  "mcpServers": {
    "unitask": { "command": "unitask", "args": ["mcp"] }
  }
}

After a restart, the agent has three new tools: run_code (run a piece of code under policy), inspect_run (read a past run’s trace), doctor (check whether code execution is even possible). The agent doesn’t know it’s talking to QEMU. It’s calling a tool that happens to be sandboxed.

For app backends — chatbots with code-interpreter modes, eval pipelines, workflow builders that execute user code — the integration is the same shape but via the MCP SDK directly. Spawn unitask mcp as a subprocess, talk MCP over stdio, get a sandboxed run_code tool. One subprocess per request: fine for prototyping and low-to-moderate volume, not the right shape for a high-throughput shared service (a remote HTTP/SSE transport isn’t built yet).
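For a sense of what travels over stdio, here is a sketch of the JSON-RPC message a backend would send to invoke run_code. The envelope (jsonrpc, id, method "tools/call", params.name, params.arguments) follows the MCP specification, and the stdio transport frames each message as one line of JSON. The argument names inside arguments ("code", "allowNet") are guesses, since the post does not show the tool's input schema. A real client also has to complete the initialize handshake first, which the MCP SDK handles.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def tools_call(name: str, arguments: dict) -> str:
    """Encode one MCP tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the only raw
    # newline in the output is the frame delimiter added here.
    return json.dumps(message) + "\n"


line = tools_call("run_code", {
    "code": "console.log(1 + 1)",  # hypothetical argument name
    "allowNet": [],                # hypothetical argument name
})
```

In a real backend this line would be written to the stdin of a spawned unitask mcp process, with the response read back from its stdout.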

Where this fits

The agent stack is going to grow a security layer the same way the web stack grew TLS — slowly at first, then everywhere, once enough breaches make the case. The shape of that layer isn’t obvious yet. unitask is one bet on what it could look like at the per-call code-execution slot: local, VM-isolated, ephemeral, declaratively constrained, traceable.

Code is at github.com/jnormore/unitask.