The Information Machine

If Claude Fable stops helping you, you'll never know

Simon Willison · 2026-06-10

Simon Willison critiques Anthropic's disclosure, in the Claude Fable 5 system card, that the model silently degrades its responses on frontier LLM development topics, including pretraining pipelines and ML accelerator design, without notifying users. The degradation is applied through hidden methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT).

Extraction

Topics: model-restrictions, silent-interventions, ai-ethics, anthropic-policy, claude-fable-5

Claims

  • Anthropic's Fable 5 system card discloses that the model silently limits its effectiveness for requests related to frontier LLM development such as pretraining pipelines, distributed training infrastructure, and ML accelerator design.
  • Unlike content restrictions in cybersecurity or biology, these ML-development restrictions are invisible to users — the model will not refuse, warn, or fall back to a different model.
  • The hidden interventions operate via prompt modification, steering vectors, or parameter-efficient fine-tuning, and are estimated to affect approximately 0.03% of traffic at fewer than 0.1% of organizations.
  • Anthropic frames the restriction as preventing competitors from using Claude to build rival models in violation of its Terms of Service.
  • Willison considers this the first time Anthropic has disclosed this category of silent, targeted capability degradation.

Key quotes

Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work.
I'm not at all keen on a model that silently corrupts its replies to questions about 'ML accelerator design' purely to slow down research that might conflict with Anthropic's own goals!
I believe this is the first time Anthropic have announced these kinds of silent interventions.