GLM-5.2 got 22.8% on ARC-AGI-2: $0.25/task

Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-06-24

Zhipu AI's GLM-5.2 model scored 22.8% on the ARC-AGI-2 benchmark at $0.25 per task, roughly 7.6x the best verified score of 3.0% from May 2025, though still far behind GPT-5.5's 85%.

Appears in

Rapid AI Benchmark Improvement: Small Models and New Entrants Closing Capability Gaps

Extraction

Topics: arc-agi-2, model-benchmarks, glm-5.2, ai-reasoning

Claims

GLM-5.2 achieved 22.8% on ARC-AGI-2 at a cost of $0.25 per task.
The best verified models on ARC-AGI-2 scored only 3.0% as recently as May 2025.
GPT-5.5 currently leads the ARC-AGI-2 leaderboard at 85%.
GLM-5.2 represents approximately a 7.6x score improvement and 7.5x cost reduction compared to the May 2025 frontier.
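
The ratios in these claims can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above. The May 2025 cost per task is not stated in the source; the value computed here is merely what a 7.5x reduction to $0.25 would imply, and the variable names are illustrative.

```python
# Figures quoted in the claims above.
glm_score_pct = 22.8          # GLM-5.2 score on ARC-AGI-2
may_2025_best_pct = 3.0       # best verified score around May 2025
gpt_55_score_pct = 85.0       # GPT-5.5 leaderboard score
glm_cost_per_task = 0.25      # USD per task
cost_reduction_factor = 7.5   # "about 7.5x cheaper"

# Score improvement over the May 2025 frontier.
score_ratio = glm_score_pct / may_2025_best_pct

# ASSUMPTION: the source gives no May 2025 cost figure; this is only
# the baseline implied by the quoted 7.5x reduction.
implied_may_2025_cost = glm_cost_per_task * cost_reduction_factor

# Remaining gap to the current leader, in percentage points.
gap_to_leader_pts = gpt_55_score_pct - glm_score_pct

print(f"score ratio: {score_ratio:.1f}x")
print(f"implied May 2025 cost: ${implied_may_2025_cost:.3f}/task")
print(f"gap to GPT-5.5: {gap_to_leader_pts:.1f} points")
```

The score ratio works out to 22.8 / 3.0, or about 7.6x, which matches the claim. The implied baseline cost of roughly $1.88 per task is a back-calculation, not a reported number.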

Key quotes

GLM-5.2 got 22.8% on ARC-AGI-2: $0.25/task

around May 2025, the best verified models on ARC-AGI-2 were only at 3.0%.

GLM-5.2 is also about 7.6x above the best frontier score from May 2025, and about 7.5x cheaper