Infrastructure

Sandbox Environment

Definition

An isolated execution environment in which an agent's code and tool calls run with restricted permissions, limited network access, and no ability to affect the host system or production data. Sandboxes are critical for agent safety because agents execute arbitrary tool calls: without isolation, a bug or prompt injection could lead to data loss, credential exposure, or system damage. Implementations range from lightweight containers (Docker) to specialized agent sandboxes (E2B, Modal) that provide on-demand compute with automatic cleanup.

Builder Context

Every agent that executes code or runs shell commands must do so in a sandbox; this is non-negotiable in production. For serverless agent sandboxes, use E2B or Modal, which handle isolation, resource limits, and cleanup automatically. For self-hosted deployments, use Docker containers with restricted capabilities: no network by default, a read-only filesystem, and CPU and memory limits. The sandbox should outlive the agent's task just long enough to extract results, then be destroyed. Never reuse a sandbox instance across users or tasks.
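
For the self-hosted case, the restrictions above map onto standard `docker run` flags. The following is a minimal sketch rather than a hardened implementation: the function names (`build_docker_argv`, `run_in_sandbox`), the container name prefix, the resource limits, and the image used in the usage note are all illustrative assumptions, and it presumes the Docker CLI is installed on the host.

```python
import subprocess
import uuid


def build_docker_argv(image, command, memory="256m", cpus="0.5", pids_limit=64):
    """Build a `docker run` argv for a single-use, locked-down container.

    Returns (container_name, argv). The default limits are illustrative only.
    """
    # A fresh random name per task, so a container is never shared or reused.
    name = f"agent-sandbox-{uuid.uuid4().hex[:12]}"
    argv = [
        "docker", "run",
        "--rm",                                   # remove the container when it exits
        "--name", name,
        "--network", "none",                      # no network access by default
        "--read-only",                            # read-only root filesystem
        "--tmpfs", "/tmp:rw,size=64m",            # small writable scratch space
        "--cap-drop", "ALL",                      # drop all Linux capabilities
        "--security-opt", "no-new-privileges",    # block privilege escalation
        "--memory", memory,                       # memory limit
        "--cpus", cpus,                           # CPU limit
        "--pids-limit", str(pids_limit),          # cap the number of processes
        "--user", "65534:65534",                  # run as an unprivileged uid:gid
        image,
        *command,
    ]
    return name, argv


def run_in_sandbox(image, command, timeout_s=30):
    """Run one task in a fresh container and return (exit_code, stdout, stderr)."""
    name, argv = build_docker_argv(image, command)
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
        return result.returncode, result.stdout, result.stderr
    except subprocess.TimeoutExpired:
        # Killing the CLI process may leave the container running,
        # so force-remove it by name before re-raising.
        subprocess.run(["docker", "rm", "-f", name], capture_output=True)
        raise
```

A call such as `run_in_sandbox("python:3.12-slim", ["python", "-c", "print(2 + 2)"])` would run one command, collect its output, and leave no container behind. Results are extracted through stdout here; a task that produces files would need an explicit, narrowly scoped output mount instead.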