How we contain Claude across products
Simon Willison · Simon Willison · 2026-05-30
Anthropic has published a detailed technical overview of the sandboxing and containment techniques it uses across Claude.ai, Claude Code, and Claude Cowork, covering gVisor, Seatbelt, Bubblewrap, and full VM isolation. Simon Willison highlights it as unusually thorough security documentation.
Extraction
Topics: llm-sandboxing, agent-security, claude-code, ai-infrastructure, credential-exfiltration
Claims
- Anthropic uses gVisor for Claude.ai, platform-native sandboxing (Seatbelt on macOS, Bubblewrap on Linux) for Claude Code, and full virtual machines for Claude Cowork.
- Anthropic's core containment philosophy is to keep credentials outside the sandbox entirely, so that they cannot be exfiltrated whether the cause is a user, model behavior, or an attacker.
- Anthropic documented a previously missed risk involving an api.anthropic.com/v1/files exfiltration vector.
- Anthropic's open-source Sandbox Runtime Tool (srt) has matured to the point that Simon Willison considers it worth a serious evaluation.
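The claim about keeping credentials outside the sandbox can be pictured with a small sketch. This is not Anthropic's implementation and it is not a sandbox: it only shows the narrower idea that a child process cannot leak a secret it was never given. The names `SECRET_MARKERS`, `scrubbed_env`, `run_without_credentials`, and `DEMO_API_KEY` are hypothetical and chosen for illustration.

```python
import os
import subprocess
import sys

# Hypothetical heuristic: substrings suggesting a variable holds a secret.
SECRET_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL")


def scrubbed_env(env):
    """Return a copy of env without variables whose names look like secrets."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in SECRET_MARKERS)
    }


def run_without_credentials(argv, env=None):
    """Run argv in a child process that only sees the scrubbed environment."""
    base = dict(os.environ if env is None else env)
    return subprocess.run(
        argv,
        env=scrubbed_env(base),
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # The parent holds a fake secret; the child reports whether it can see it.
    demo_env = dict(os.environ, DEMO_API_KEY="not-a-real-key")
    result = run_without_credentials(
        [sys.executable, "-c", "import os; print('DEMO_API_KEY' in os.environ)"],
        env=demo_env,
    )
    print(result.stdout.strip())
```

Name-based filtering is a weak heuristic. A stricter variant would pass the child an explicit allowlist of variables, and real containment would also need the filesystem and network boundaries that the tools named above provide.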
Key quotes
- "A complaint I often have about sandboxing products is that they are rarely thoroughly documented, and in the absence of detailed documentation it's hard to know how much I can trust them."
- "We constrain where and how an agent can act with process sandboxes, VMs, filesystem boundaries, and egress controls. The goal is to set a hard boundary on what an agent can reach."
- "if credentials never enter the sandbox, they can't be exfiltrated, regardless of whether the cause is a user, a model finding a 'creative' path, or an attacker."
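The egress controls mentioned in the quotes above can be sketched as a hostname allowlist check. This is a minimal illustration under stated assumptions, not a description of how any Anthropic product filters traffic. `ALLOWED_HOSTS` and `is_egress_allowed` are hypothetical names, and the listed hosts are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; not taken from any real configuration.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org"})


def is_egress_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Allow only https URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in allowed_hosts


if __name__ == "__main__":
    for candidate in (
        "https://pypi.org/simple/requests/",
        "https://pypi.org.evil.example/upload",
        "http://pypi.org/simple/",
    ):
        print(candidate, is_egress_allowed(candidate))
```

A check like this compares exact hostnames, so a lookalike such as `pypi.org.evil.example` is rejected. It cannot distinguish between endpoints on the same host, so every path on an allowed host stays reachable.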