Sandbox Isolation for Coding Agents
Sandbox Isolation for Coding Agents
Section titled “Sandbox Isolation for Coding Agents”Confidence: Tier 2 — Official Docker docs + verified vendor documentation Reading time: ~10 minutes Scope: Running Claude Code safely in isolated environments
| Solution | Isolation | Local/Cloud | Best For |
|---|---|---|---|
| Docker Sandboxes | microVM (hypervisor) | Local | Max security, Docker-in-Docker needed |
| Native CC sandbox | Process (Seatbelt/bubblewrap) | Local | Lightweight daily dev, trusted code |
| Fly.io Sprites | Firecracker microVM | Cloud | API-driven agent workflows |
| E2B | Firecracker microVM | Cloud | Multi-framework AI apps |
| Vercel Sandboxes | Firecracker microVM | Cloud | Next.js / Vercel ecosystem |
| Cloudflare Sandbox SDK | Container | Cloud | Workers-based serverless |
Quick start:
docker sandbox run claude ~/my-project1. The Problem: Safe Autonomy
Section titled “1. The Problem: Safe Autonomy”Claude Code’s permission system protects you from unintended actions. But it creates a tension:
--dangerously-skip-permissionsremoves all guardrails — Claude canrm -rf,git push --force, orDROP TABLEwithout asking. On a bare host, this is dangerous.- Permission fatigue — approving every file edit and shell command slows down autonomous workflows. For large refactors or CI pipelines, interactive approval is impractical.
- The gap: How do you run Claude Code autonomously AND safely?
Answer: Isolate the execution environment. Let the agent run free inside a sandbox where the blast radius is contained. The sandbox is the security boundary, not the permission system.
2. Isolation Approaches
Section titled “2. Isolation Approaches”3. Docker Sandboxes
Section titled “3. Docker Sandboxes”Source: docs.docker.com/ai/sandboxes/ Requires: Docker Desktop 4.58+ (macOS or Windows)
Docker Sandboxes run AI coding agents in microVM-based isolation on your local machine. Each sandbox gets its own private Docker daemon and filesystem. Sandboxes do NOT appear in docker ps — they are VMs, not containers.
Quick Start
Section titled “Quick Start”# Create and run a sandbox with your projectdocker sandbox run claude ~/my-project
# Run with autonomous mode (safe inside sandbox)docker sandbox run claude ~/my-project -- --dangerously-skip-permissions
# Pass a prompt directlydocker sandbox run claude ~/my-project -- "Refactor auth module to use JWT"
# Continue a previous sessiondocker sandbox run my-sandbox -- --continueArchitecture
Section titled “Architecture”┌──────────────────────────────────────────────────────────┐│ HOST MACHINE ││ ││ ┌────────────────────────────────────────────────────┐ ││ │ DOCKER SANDBOX (microVM) │ ││ │ │ ││ │ ┌──────────────┐ ┌───────────────────────────┐ │ ││ │ │ Claude Code │ │ Private Docker daemon │ │ ││ │ │ (--dsp mode) │ │ (isolated from host) │ │ ││ │ └──────────────┘ └───────────────────────────┘ │ ││ │ │ ││ │ ┌──────────────────────────────────────────────┐ │ ││ │ │ Workspace: ~/my-project (synced with host) │ │ ││ │ │ Same absolute path as host │ │ ││ │ └──────────────────────────────────────────────┘ │ ││ │ │ ││ │ Base: Ubuntu, Node.js, Python 3, Go, Git, │ ││ │ Docker CLI, GitHub CLI, ripgrep, jq │ ││ │ User: non-root 'agent' with sudo │ ││ └────────────────────────────────────────────────────┘ ││ ││ Host Docker daemon: NOT accessible from sandbox ││ Host filesystem: NOT accessible (except workspace) │└──────────────────────────────────────────────────────────┘Key properties:
- Workspace sync: Host directory mounts at the same absolute path inside the sandbox
- Full isolation: Agent cannot access host Docker daemon, host containers, or files outside workspace
- Private Docker: Each sandbox has its own Docker daemon for building/running containers
- Claude runs with
--dangerously-skip-permissions: Intentional — the sandbox is the security boundary
Network Policies
Section titled “Network Policies”Control what the sandbox can access on the network.
# View network activitydocker sandbox network log my-sandbox
# Set up denylist mode (block all, allow specific)docker sandbox network proxy my-sandbox \ --policy deny \ --allow-host api.anthropic.com \ --allow-host "*.npmjs.org" \ --allow-host "*.pypi.org" \ --allow-host github.com
# Set up allowlist mode (allow all, block specific)docker sandbox network proxy my-sandbox \ --policy allow \ --block-host "*.malicious-domain.com" \ --block-cidr "192.168.0.0/16"| Mode | Default behavior | Use case |
|---|---|---|
| Allowlist (default) | Permits most traffic, blocks specific destinations | General development |
| Denylist | Blocks all traffic, allows only specified destinations | High-security environments |
Default blocked ranges: Private CIDRs (10.0.0.0/8, 127.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 169.254.0.0/16) and IPv6 equivalents.
Pattern matching: Exact (example.com), port-specific (example.com:443), wildcards (*.example.com matches subdomains only). Most specific pattern wins.
Security caveat: Domain filtering does not inspect traffic content. Broad allowances (e.g., github.com) permit access to user-generated content. HTTPS inspection is not performed in bypass mode.
Config storage: Per-sandbox at ~/.docker/sandboxes/vm/[name]/proxy-config.json. Policies persist across restarts.
Custom Templates
Section titled “Custom Templates”For teams needing reproducible environments with specific tooling:
FROM docker/sandbox-templates:claude-code
USER root
# Install project-specific dependenciesRUN apt-get update && apt-get install -y \ postgresql-client \ redis-tools
# Install global npm packagesRUN npm install -g pnpm turbo
USER agentBuild and use:
# Build the templatedocker build -t my-team-sandbox:v1 .
# Create sandbox with custom templatedocker sandbox create my-sandbox \ --template my-team-sandbox:v1 \ --load-local-template ~/my-projectUse custom templates when: team environments, specific tool versions, repeated setups, complex configurations. For simple one-off work, use the defaults and let the agent install what it needs.
Commands Reference
Section titled “Commands Reference”| Command | Description |
|---|---|
docker sandbox run <agent> <path> | Create and start a sandbox |
docker sandbox create <name> | Create without starting |
docker sandbox ls | List all sandboxes |
docker sandbox run <name> -- "prompt" | Pass a prompt |
docker sandbox run <name> -- --continue | Continue previous session |
docker sandbox run <name> -- --dsp | Short for —dangerously-skip-permissions |
docker sandbox network proxy <name> | Configure network policies |
docker sandbox network log <name> | View network activity |
Authentication
Section titled “Authentication”Option 1: API key (recommended for headless)
Set ANTHROPIC_API_KEY in ~/.bashrc or ~/.zshrc. The sandbox daemon reads from these files, not the current shell session. Restart the daemon after changes. Persists across sandbox recreation.
Option 2: Interactive login (per-session)
Triggered automatically if no credentials found. Use /login inside Claude Code to trigger manually. Authentication does NOT persist when the sandbox is destroyed.
Supported Agents
Section titled “Supported Agents”| Agent | Provider | Status |
|---|---|---|
| Claude Code | Anthropic | Full support |
| Codex CLI | OpenAI | Supported |
| Gemini CLI | Supported | |
| cagent | Docker | Supported |
| Kiro | AWS | Supported |
Limitations
Section titled “Limitations”- macOS and Windows only for microVM mode. Linux uses legacy container-based sandboxes (Docker Desktop 4.57+).
- Docker Desktop required — not available with standalone Docker Engine. Community alternatives like dclaude (Patrick Debois) wrap Claude Code in standard Docker containers for Docker Engine-only environments, but use container isolation (not microVM) and mount the host Docker socket — weaker security boundary.
- MCP Gateway not yet supported inside sandboxes.
- No GPU passthrough — not suitable for ML training workloads.
- Workspace sync is one-way: changes inside the sandbox propagate to the host, but concurrent host edits may conflict.
4. Native Claude Code Sandbox
Section titled “4. Native Claude Code Sandbox”Source: code.claude.com/docs/en/sandboxing Requires: macOS (built-in) or Linux/WSL2 (bubblewrap + socat) Feature: Claude Code v2.1.0+
Claude Code includes built-in native sandboxing using OS-level primitives for process-level isolation. No Docker required.
Architecture
Section titled “Architecture”┌──────────────────────────────────────────────────────┐│ HOST MACHINE ││ ││ Claude Code (main process) ││ │ ││ ├─ spawn bash command ││ │ ││ ▼ ││ Sandbox wrapper (Seatbelt/bubblewrap) ││ │ ││ ├─ Filesystem: read all, write CWD only ││ ├─ Network: SOCKS5 proxy, domain filtering ││ ├─ Process: isolated environment ││ │ ││ ▼ ││ Command executes with restrictions ││ │ ││ └─ Violations blocked at OS level ││ │└──────────────────────────────────────────────────────┘Key differences from Docker Sandboxes:
| Aspect | Native Sandbox | Docker Sandboxes |
|---|---|---|
| Isolation level | Process (Seatbelt/bubblewrap) | microVM (hypervisor) |
| Kernel | Shared with host | Separate kernel per sandbox |
| Setup | 0 dependencies (macOS), 2 packages (Linux) | Docker Desktop 4.58+ |
| Overhead | Minimal (~1-3% CPU) | Moderate (~5-10% CPU, +200MB RAM) |
| Docker-in-Docker | ❌ Not supported | ✅ Private Docker daemon |
| Use case | Daily dev, trusted code | Untrusted code, max isolation |
OS Primitives
Section titled “OS Primitives”macOS: Uses Seatbelt (TrustedBSD Mandatory Access Control)
- Built-in, works out of the box
- Kernel-level system call filtering
Linux/WSL2: Uses bubblewrap (Linux namespaces + seccomp)
- Requires installation:
sudo apt-get install bubblewrap socat - Creates isolated namespace per command
WSL1: ❌ Not supported (bubblewrap needs kernel features unavailable)
Windows native: ⏳ Planned (not yet available)
Quick Start
Section titled “Quick Start”# Enable sandboxing (interactive menu)/sandbox
# Linux/WSL2 only: install prerequisites firstsudo apt-get install bubblewrap socat # Ubuntu/Debiansudo dnf install bubblewrap socat # FedoraTwo modes:
- Auto-allow mode: Bash commands auto-approved if sandboxed (recommended for daily dev)
- Regular permissions mode: All commands require approval (for high-security)
Configuration Example
Section titled “Configuration Example”{ "sandbox": { "autoAllowMode": true, "network": { "policy": "deny", "allowedDomains": [ "api.anthropic.com", "registry.npmjs.com", "github.com" ] } }, "permissions": { "deny": [ "Read(~/.ssh/**)", "Read(~/.aws/**)", "Edit(~/.ssh/**)", "Edit(~/.aws/**)" ] }}When to Use Native vs Docker
Section titled “When to Use Native vs Docker”Use Native Sandbox when:
- ✅ Daily development with trusted team
- ✅ Lightweight setup (no Docker Desktop)
- ✅ Minimal overhead priority
- ✅ Code is mostly trusted
- ✅ Don’t need Docker-in-Docker
Use Docker Sandboxes when:
- ✅ Running untrusted code
- ✅ Maximum security isolation (kernel exploits protection)
- ✅ Need private Docker daemon inside sandbox
- ✅ Testing AI-generated scripts
- ✅ Production CI/CD with sensitive workloads
Decision tree:
Daily development?├─ Trusted code + team → Native Sandbox (lightweight)└─ Untrusted scripts → Docker Sandboxes (max isolation)
Need Docker inside?├─ Yes → Docker Sandboxes (only option)└─ No → Either works, prefer Native for simplicity
Maximum security?├─ Yes (kernel exploit protection) → Docker Sandboxes└─ Standard (process isolation OK) → Native SandboxSecurity Limitations
Section titled “Security Limitations”⚠️ Native Sandbox limitations (see guide/sandbox-native.md for details):
- Shared kernel: Vulnerable to kernel exploits (Docker microVM protects against this)
- Domain fronting: CDN-based bypass possible (Cloudflare, Akamai)
- Unix sockets: Can grant unexpected privileges if misconfigured
- Filesystem: Overly broad write permissions enable privilege escalation
For untrusted code, Docker Sandboxes provide stronger isolation.
Open-Source Runtime
Section titled “Open-Source Runtime”The sandbox implementation is available as an open-source npm package:
# Use sandbox runtime directlynpx @anthropic-ai/sandbox-runtime <command-to-sandbox>
# Example: sandbox an MCP servernpx @anthropic-ai/sandbox-runtime node mcp-server.jsRepository: github.com/anthropic-experimental/sandbox-runtime
Deep Dive
Section titled “Deep Dive”For complete technical details, configuration examples, troubleshooting, and security analysis:
Covers: OS primitives, network proxy architecture, sandbox modes, escape hatch, security limitations, best practices.
5. Cloud Sandboxes Landscape
Section titled “5. Cloud Sandboxes Landscape”Fly.io Sprites
Section titled “Fly.io Sprites”Source: sprites.dev
Hardware-isolated execution environments built on Firecracker microVMs, by Fly.io.
- Isolation: Firecracker microVMs with full hardware isolation
- Persistence: Fully mutable ext4 filesystem, automatic 100GB partition
- Checkpoint/restore: Live checkpoints in ~300ms (copy-on-write), restore under 1 second
- HTTP access: Individual URLs per Sprite, auto-activation on requests (cold-start under 1s)
- Network: Layer 3 egress policies, public/private toggles
- Resources: Up to 8 CPUs, 16GB RAM per Sprite
- API: CLI (
spritecommand), REST API, JavaScript and Go client libraries - Pricing: Pay-per-use ($0.07/CPU-hour, $0.04/GB-hour). $30 trial credits.
Cloudflare Sandbox SDK
Section titled “Cloudflare Sandbox SDK”Secure code execution in isolated containers, built on Cloudflare’s Workers platform.
- Isolation: Containers (not microVMs) on Cloudflare’s serverless runtime
- Languages: Python, JavaScript/TypeScript, shell commands
- Persistence: R2 bucket mounting as local filesystem paths
- API: TypeScript SDK (
getSandbox(),exec(),runCode(), file ops, WebSocket) - Integration: Claude generates code, Sandbox executes it, results return as text/visualizations
- Pricing: Workers Paid plan required. Based on Containers platform pricing.
- Tutorial: developers.cloudflare.com/sandbox/tutorials/claude-code/
Vercel Sandboxes
Section titled “Vercel Sandboxes”Source: vercel.com/docs/vercel-sandbox/
Ephemeral Linux microVMs for AI agents and code generation, GA since 2026-01-30.
- Isolation: Firecracker microVMs, isolated from env vars, DBs, and cloud resources
- Performance: Sub-second initialization, automatic termination on task completion
- Timeouts: Default 5 min, Hobby up to 45 min, Pro/Enterprise up to 5 hours
- SDK:
Sandbox,Command,Snapshotclasses. Filesystem snapshots for faster repeated runs. - Auth: Vercel OIDC tokens (recommended) or access tokens for external CI/CD
- Integration: Works with Claude’s Agent SDK for autonomous agent tasks
Source: e2b.dev
Open-source sandbox platform for AI agents and LLM applications.
- Isolation: Firecracker microVMs (same technology as AWS Lambda)
- Performance: ~150ms cold boot, under 25ms standby resume
- Custom images: Up to 10GB, boot in under 2 seconds (Blueprints)
- Snapshots: Capture and restore full VM state
- Languages: Python, JavaScript, Ruby, C++, anything on Linux. LLM-agnostic.
- Integrations: LangChain, LangGraph, LlamaIndex, Vercel/Next.js, Ollama
- Deployment: Cloud-hosted, BYOC (AWS/GCP/Azure), self-hosted on-premises/VPC
- Pricing: Free tier ($100 credits, 1h max), Pro from $150/month (24h max)
Native Claude Code Sandbox Mode
Section titled “Native Claude Code Sandbox Mode”Claude Code’s built-in process-level sandboxing (Layer 4 in the architecture).
- No external dependencies: Works out of the box
- Process isolation: Restricts what commands Claude can execute
- Configurable: Through
allowedToolsin settings - Limitations: Not full VM isolation — shares host kernel and filesystem
Use this when: Docker is unavailable, lightweight isolation is sufficient, or you want defense-in-depth alongside a sandbox.
6. Comparison Matrix
Section titled “6. Comparison Matrix”| Criterion | Docker Sandboxes | Native CC | Fly.io Sprites | Cloudflare SDK | E2B | Vercel Sandboxes |
|---|---|---|---|---|---|---|
| Isolation level | microVM (hypervisor) | Process (Seatbelt/bubblewrap) | Firecracker microVM | Container | Firecracker microVM | Firecracker microVM |
| Kernel isolation | ✅ Separate kernel | ❌ Shared kernel | ✅ Separate kernel | Partial | ✅ Separate kernel | ✅ Separate kernel |
| Runs locally | Yes | Yes | No (cloud) | No (cloud) | No (cloud) | No (cloud) |
| Setup | Docker Desktop 4.58+ | 0 deps (macOS), 2 pkgs (Linux) | API key | Workers Paid | API key | SDK |
| Docker-in-Docker | ✅ Private daemon | ❌ Not supported | Yes | No | Yes | Yes |
| Network control | Allow/Deny lists | Allow/Deny lists (SOCKS5) | L3 egress policies | Not detailed | Not detailed | Not detailed |
| Platform | macOS, Windows (WSL2) | macOS, Linux, WSL2 | Any (API) | Any (Workers) | Any (API/SDK) | Any (SDK) |
| Overhead | Moderate (~5-10% CPU) | Minimal (~1-3% CPU) | Cloud | Cloud | Cloud | Cloud |
| Free tier | Docker Desktop | Free | $30 credits | Workers Paid | $100 credits | Yes (limited) |
| Best for | Max security, Docker needed | Daily dev, trusted code | API-driven agents | Serverless | Multi-framework | Next.js/Vercel |
7. Safe Autonomy Workflows
Section titled “7. Safe Autonomy Workflows”Pattern: Docker Sandbox + —dangerously-skip-permissions
Section titled “Pattern: Docker Sandbox + —dangerously-skip-permissions”The recommended pattern for local autonomous development:
# 1. Create a sandbox with your projectdocker sandbox create my-feature ~/my-project
# 2. Configure network (optional, recommended for security)docker sandbox network proxy my-feature \ --policy deny \ --allow-host api.anthropic.com \ --allow-host "*.npmjs.org" \ --allow-host github.com
# 3. Run Claude autonomously (safe inside sandbox)docker sandbox run my-feature -- --dangerously-skip-permissions \ "Refactor the auth module to use JWT. Run all tests before finishing."
# 4. Review changes on host (workspace syncs automatically)cd ~/my-project && git diff
# 5. If satisfied, commit. If not, discard or re-run.git add -A && git commit -m "feat: JWT auth (sandbox-generated)"Pattern: CI/CD Pipeline with Sandbox
Section titled “Pattern: CI/CD Pipeline with Sandbox”Sketch for GitHub Actions:
jobs: agent-task: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4
- name: Run Claude in E2B sandbox uses: e2b-dev/e2b-github-action@v1 with: api-key: ${{ secrets.E2B_API_KEY }} command: | claude --dangerously-skip-permissions \ -p "Run the full test suite and fix any failures"For CI/CD, cloud sandboxes (E2B, Vercel, Sprites) are typically better than Docker Sandboxes since they don’t require Docker Desktop.
8. Anti-Patterns
Section titled “8. Anti-Patterns”| Anti-pattern | Why it’s dangerous | Do instead |
|---|---|---|
--dangerously-skip-permissions without sandbox | Agent has unrestricted access to host filesystem, network, and Docker | Use a sandbox as the security boundary |
| Assuming containers = VMs | Containers share the host kernel. A container escape exposes the host. | Use microVM-based solutions (Docker Sandboxes, E2B, Sprites) for strong isolation |
| Mounting entire filesystem into sandbox | Defeats the purpose of isolation. Agent can access credentials, SSH keys, etc. | Mount only the project workspace directory |
Allowlisting * in network policies | Agent can exfiltrate data to any endpoint | Use denylist mode with explicit allowances |
Skipping git diff review after sandbox run | Autonomous agent may have made unintended changes | Always review diffs before committing sandbox-generated code |
| Using sandbox as excuse to skip code review | Isolation protects the host, not code quality | Sandbox + code review are complementary, not alternatives |
See Also
Section titled “See Also”- architecture.md — Layer 4 (Sub-Agent Architecture) and permission model
- security-hardening.md — MCP vetting, injection defense, CVE tracking
- code.claude.com/docs/en/sandboxing — Official Claude Code sandbox docs
- docs.docker.com/ai/sandboxes/ — Docker Sandboxes documentation