Another great paper from Google.

Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-06-04

A Google research paper shows that general LLMs can solve formal mathematics problems when proof-writing is decomposed into planning and step-by-step verification, raising benchmark performance from under 10% to 70%.

Appears in

AI Systems Achieve Verifiable Mathematical Reasoning

Extraction

Topics: llm-reasoning, formal-mathematics, ai-research, chain-of-thought

Claims

General LLMs fail at writing complete formal proofs in a single attempt.
Breaking proof-writing into planning and step-by-step checking dramatically improves LLM performance on formal math.
Google researchers raised general LLM performance on formal math benchmarks from under 10% to 70% using this approach.
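
The plan-then-check idea in the claims above can be illustrated with a small sketch. This is not the paper's method or code. The function `prove_stepwise`, the toy proposer and checker, and the names `lemma_a`, `tactic_a`, and so on are all hypothetical stand-ins. In a real system the proposer would presumably be an LLM call and the checker a proof assistant, but the summary does not say which tools were used.

```python
from typing import Callable, List, Optional


def prove_stepwise(
    plan: List[str],
    propose: Callable[[str, int], str],
    check: Callable[[str, str], bool],
    max_retries: int = 3,
) -> Optional[List[str]]:
    """Work through a plan one subgoal at a time (hypothetical helper).

    Each candidate step is checked before moving on. A failed step is
    retried on its own, so earlier verified steps are never discarded.
    Returns the verified steps, or None if any subgoal cannot be closed.
    """
    verified: List[str] = []
    for subgoal in plan:
        for attempt in range(max_retries):
            candidate = propose(subgoal, attempt)
            if check(subgoal, candidate):
                verified.append(candidate)
                break
        else:
            # No candidate passed the checker within the retry budget.
            return None
    return verified


# Toy stand-ins (assumptions for illustration only).
VALID = {"lemma_a": "tactic_a", "lemma_b": "tactic_b"}


def toy_propose(subgoal: str, attempt: int) -> str:
    # The first attempt is deliberately wrong to exercise the retry path.
    return "sorry" if attempt == 0 else VALID[subgoal]


def toy_check(subgoal: str, candidate: str) -> bool:
    # Stands in for a formal verifier accepting or rejecting one step.
    return VALID.get(subgoal) == candidate


result = prove_stepwise(["lemma_a", "lemma_b"], toy_propose, toy_check)
print(result)
```

With a retry budget of 1 the same toy proposer fails on the first subgoal and the function returns None. That loosely mirrors the contrast the claims draw between a single full-proof attempt and checked, step-by-step construction.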

Key quotes

Shows general LLMs can solve formal math by planning proofs and checking each step. Raised general LLM performance from under 10% to 70%.

A general LLM failed badly when asked to write full formal proofs in 1 try, but became much stronger when [given a structured approach].