Pythagoras-Prover just made Lean theorem proving look far less dependent on giant models, with a 4B prover beating DeepS…
Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-06-15
Pythagoras-Prover, a 4B parameter model, outperforms DeepSeek-Prover-V2-671B on the MiniF2F Pass@32 benchmark for Lean theorem proving, demonstrating that training data geometry can compensate for a 167x reduction in model scale.
Appears in
Extraction
Topics: formal-theorem-provinglean-proversmall-language-modelsdata-efficiencybenchmarks
Claims
- Pythagoras-Prover with 4B parameters beats DeepSeek-Prover-V2-671B on the MiniF2F Pass@32 benchmark.
- Better training data geometry can recover a substantial amount of the performance gap caused by reduced model scale.
- Formal reasoning tasks may be far less dependent on giant models than previously assumed.
Key quotes
Pythagoras-Prover just made Lean theorem proving look far less dependent on giant models, with a 4B prover beating DeepSeek-Prover-V2-671B at MiniF2F Pass@32.
Shows in formal reasoning, better data geometry can buy back an astonishing amount of scale.