Introducing Claude Opus 4.8

Anthropic News · 2026-05-28

Anthropic releases Claude Opus 4.8, an upgrade over Opus 4.7 that improves agentic task reliability, coding benchmarks, and honesty, while launching dynamic workflows in Claude Code and per-user effort controls at unchanged pricing.

Open original ↗

Appears in

Extraction

Topics: model-releaseagentic-aiai-alignmentlarge-language-modelsclaude

Claims

Claude Opus 4.8 is the only model to complete every case on the Super-Agent benchmark, beating Opus 4.7 and GPT-5.5 at cost parity.
Opus 4.8 scores 84% on Online-Mind2Web, outperforming both Opus 4.7 and GPT-5.5 on computer-use and browser-agent tasks.
Opus 4.8 is approximately four times less likely than Opus 4.7 to allow code flaws to pass unremarked, reflecting improved honesty.
Anthropic's alignment team found Opus 4.8 has substantially lower rates of misaligned behavior than Opus 4.7, comparable to Claude Mythos Preview.
Fast mode pricing for Opus 4.8 is three times cheaper than it was for previous Opus models.

Key quotes

Claude Opus 4.8 has noticeably better judgment. In Claude Code, it asks the right questions, catches its own mistakes, pushes back when a plan isn't sound, and builds up confidence around complex, multi-service explorations before making big changes.

Opus 4.8 is around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked.

Opus 4.8 'reaches new highs on our measures of prosocial traits like supporting user autonomy and acting in the user's best interest.'