The Information Machine

OpenAI Help: Lockdown Mode

Simon Willison · 2026-06-05

Simon Willison analyzes OpenAI's newly launched ChatGPT Lockdown Mode, which blocks outbound network requests to cut off data exfiltration vectors in prompt injection attacks, and argues it correctly targets one leg of the 'Lethal Trifecta' security problem.

Extraction

Topics: prompt-injection, llm-security, chatgpt, data-exfiltration

Claims

  • OpenAI's Lockdown Mode blocks outbound network requests to prevent data exfiltration during prompt injection attacks.
  • Lockdown Mode does not stop prompt injections from appearing in content ChatGPT processes; it only removes their ability to exfiltrate data.
  • The feature uses deterministic mechanisms not evaluated by AI systems, making it resistant to adversarial subversion.
  • The 'Lethal Trifecta' in LLM security requires three simultaneous conditions: access to private data, exposure to untrusted content, and an exfiltration vector.
  • The existence of Lockdown Mode implies ChatGPT's default configuration does not robustly block determined data exfiltration attacks.
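
The three-condition framing in the claims above can be made concrete with a small sketch. This is only an illustration of the logic, not anything taken from OpenAI or from the original post: the `AgentCapabilities` class and its field names are hypothetical. The point it shows is that removing any single leg, here the exfiltration vector, breaks the trifecta even when the other two conditions still hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what an LLM system can touch."""
    reads_private_data: bool
    sees_untrusted_content: bool
    has_exfiltration_vector: bool


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    # All three legs must hold at the same time for the attack to work.
    return (
        caps.reads_private_data
        and caps.sees_untrusted_content
        and caps.has_exfiltration_vector
    )


# A system with all three legs present.
all_three_legs = AgentCapabilities(True, True, True)

# The same system with only the exfiltration leg removed.
# Injected instructions can still appear, but have no way out.
no_exfiltration_leg = AgentCapabilities(True, True, False)

print(has_lethal_trifecta(all_three_legs))
print(has_lethal_trifecta(no_exfiltration_leg))
```

Note that `no_exfiltration_leg` still sees untrusted content, which matches the claim that prompt injections can still show up; only their ability to send data out is gone.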

Key quotes

The Lethal Trifecta occurs when an LLM system has access to all three of access to private data, exposure to untrusted content and a way to steal data and transmit it back to the attacker.
It looks to me like lockdown mode directly attacks that leg, using mechanisms that are deterministic and, crucially, are not evaluated by AI systems that themselves can be subverted by sufficiently devious attacks.
The existence of lockdown mode does however imply that ChatGPT, in its default settings, does not provide robust protection against sufficiently determined data exfiltration attacks!
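
To illustrate what "deterministic and not evaluated by AI systems" can mean in practice, here is a minimal sketch of an outbound request gate. The source does not describe how OpenAI implements Lockdown Mode, so everything here is an assumption made for illustration: the `egress_allowed` function, the `ALLOWED_HOSTS` set, and the example hostnames are all hypothetical. The decision depends only on the URL's host and a fixed allowlist, so nothing the model reads or writes can change the outcome.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist. An empty set means every outbound request is blocked.
ALLOWED_HOSTS: frozenset[str] = frozenset()


def egress_allowed(url: str, allowed_hosts: frozenset[str] = ALLOWED_HOSTS) -> bool:
    """Return True only if the URL's host is on the fixed allowlist.

    No model is consulted: the check is a plain set lookup, so prompt
    content cannot argue its way past it.
    """
    try:
        host = urlsplit(url).hostname
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    return host is not None and host in allowed_hosts


# With the default empty allowlist, an exfiltration attempt is refused.
print(egress_allowed("https://attacker.example/collect?d=secret"))

# With an explicit allowlist, only the listed host gets through.
trusted = frozenset({"docs.example.com"})
print(egress_allowed("https://docs.example.com/page", trusted))
print(egress_allowed("https://attacker.example/collect?d=secret", trusted))
```

The design choice worth noticing is that the gate fails closed: a missing host, a malformed URL, or an unlisted host all produce a refusal.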