OpenAI Model Disproves 80-Year-Old Erdős Geometry Conjecture

Version 6

2026-05-25 02:46 UTC · 196 items

Changes since v5

The most significant new development is OpenAI's publication of a formal technical PDF titled 'Planar Point Sets with Many Unit Distances' [4], which moves the story from announcement phase to scrutiny phase and gives mathematicians an OpenAI-authored document to analyze directly. Sebastien Bubeck [12], Noam Solomon [10], and Robert Talbert [11] are newly named commentators, broadening the credentialed engagement. Forbes [13] has joined mainstream coverage with a framing centered on mathematician reactions rather than superlatives. The Polish mathematician's 'singularity' claim has gained more context (20 years spent designing very hard problems [28]), making it a more substantiated position than it initially appeared. A GPT-5.5 vs. GPT-5.4 FrontierMath comparison [29] and Greg Burnham's FrontierMath analysis [34] add new threads to the benchmark sub-narrative.

What

An OpenAI reasoning model disproved the planar unit distance conjecture — open since Paul Erdős posed it in 1946 — by constructing a counterexample using a surprising algebraic number theory method [1][2]. OpenAI has now published a formal PDF titled 'Planar Point Sets with Many Unit Distances' [4], providing technical documentation beyond the original announcement. Sebastien Bubeck has engaged on X [12], Forbes has published a dedicated piece framing the result around mathematician reactions [13], and named mathematicians including Noam Solomon have commented publicly [10]. A parallel story has crystallized around GPT-5.4 solving an open FrontierMath problem — with a Polish mathematician who spent 20 years designing such problems calling it 'singularity' [27][28] — while a GPT-5.5 vs. GPT-5.4 FrontierMath comparison has emerged [29], suggesting the benchmark story is accelerating independently.

Why it matters

The publication of the OpenAI technical PDF [4] marks the transition from announcement-phase to scrutiny-phase: mathematicians now have a document to analyze, not just a press release. Whether that document satisfies the community's standards for an independently verifiable disproof is the proximate question on which the story's long-term credibility depends — and the answer will set a precedent for how AI mathematical claims are evaluated going forward.

Open questions

Does the OpenAI PDF 'Planar Point Sets with Many Unit Distances' [4] present the counterexample in sufficient mathematical detail for independent verification, or does it describe without fully exposing the construction? This is now the central question for the result's credibility.
What does Sebastien Bubeck's engagement [12] reveal — does he endorse the method, raise reservations, or simply amplify? His research background and prior OpenAI affiliation make his assessment particularly informative.
Greg Burnham's 'What I Wish I Knew About FrontierMath' [34] likely contains critical or insider analysis of the benchmark — does it address evaluator independence, the structure of open problems, or OpenAI's funding relationship with the benchmark?
The GPT-5.5 vs. GPT-5.4 FrontierMath comparison [29] suggests benchmark scores are advancing in real time. Does GPT-5.5 raise the 38% mark further, and does accelerating benchmark progress ease or deepen the evaluator-independence concern?

Narrative

On May 20, 2026, OpenAI announced that one of its general-purpose reasoning models had disproved the planar unit distance conjecture in discrete geometry, a problem first posed by Paul Erdős in 1946 [1]. The model produced a counterexample, which establishes that the conjecture is false. Within approximately one day, a preprint titled 'Remarks on the disproof of the unit distance conjecture' appeared on arXiv [2][3], and OpenAI subsequently published its own formal PDF titled 'Planar Point Sets with Many Unit Distances' [4], providing the first directly OpenAI-authored technical documentation of the mathematics. The method involved a bridge between algebraic number theory and plane geometry that multiple commentators treated as the intellectually surprising core of the result, more significant than the bare fact of disproving an 80-year-old conjecture [5][6].
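For readers unfamiliar with the quantity at stake, the conjecture concerns how many pairs among n points in the plane can lie at distance exactly 1. The Python sketch below is purely illustrative: it counts such pairs in a small square grid and finds the most frequent distance, which a rescaling of the grid would turn into the unit distance. It does not reproduce the counterexample or the algebraic number theory method described in the OpenAI PDF [4], and the function names are hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def squared_distance_counts(points):
    """Count how often each squared distance occurs among all pairs of points.

    Points are (x, y) tuples of integers, so squared distances are exact.
    """
    counts = Counter()
    for (x1, y1), (x2, y2) in combinations(points, 2):
        counts[(x1 - x2) ** 2 + (y1 - y2) ** 2] += 1
    return counts


def square_grid(k):
    """Return the k-by-k integer grid as a list of (x, y) points."""
    return [(x, y) for x in range(k) for y in range(k)]


def unit_distance_count(points):
    """Number of pairs at distance exactly 1 (squared distance 1)."""
    return squared_distance_counts(points)[1]


def most_popular_distance(points):
    """Return (squared_distance, pair_count) for the most frequent distance."""
    return squared_distance_counts(points).most_common(1)[0]


if __name__ == "__main__":
    for k in (3, 10):
        grid = square_grid(k)
        print(k, unit_distance_count(grid), most_popular_distance(grid))
```

On a 3×3 grid the grid spacing is itself the most frequent distance. On a 10×10 grid another distance already occurs more often than the spacing, which is why grid-style constructions rescale the point set so that the most frequent distance becomes 1.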

The result has drawn engagement across the mathematical community. Gil Kalai, a prominent combinatorialist, called it 'amazing' on both X and his blog [7][8]. Terence Tao posted on Mathstodon with partial text beginning 'I was able to use an extended…', indicating direct engagement with the approach [9]. Noam Solomon, a mathematician, posted on LinkedIn [10], and Robert Talbert remarked on how mathematicians were responding to the result [11]. Sebastien Bubeck, an ML researcher known for foundational work on reasoning in language models, has also engaged on X [12]. Forbes published a dedicated piece titled 'The AI Breakthrough That Has Mathematicians Paying Attention' [13], joining Scientific American, The Guardian, and New Scientist in mainstream coverage [14][15][16]. Michael Harris at Columbia has published multiple Silicon Reckoner posts on the Erdős result, 'artificial intuition,' and the expanding AI mathematics conversation [17][18][19]. This sustained critical engagement goes beyond any single event and carries particular weight given his prior characterization of OpenAI's FrontierMath involvement as a 'scandal' [20]. Communities ranging from Reddit r/slatestarcodex [21] to r/MachineLearning [22] have hosted extended discussion, the latter notably framing the result under the skeptical word 'claims.'

A parallel and distinct development has run alongside the Erdős story. GPT-5.4 achieved 38% on the FrontierMath benchmark [23] and solved an open problem from that benchmark, a result that was subsequently autoformalized (converted to a formal proof in a proof assistant), providing a degree of machine-checkable credibility [24][25][26]. Quantum Zeitgeist detailed the reaction of a Polish mathematician who spent 20 years designing very hard problems and described GPT-5.4's solution as 'singularity' [27][28]. A GPT-5.5 vs. GPT-5.4 FrontierMath comparison has also surfaced [29], suggesting benchmark progress is accelerating faster than the evaluation infrastructure can stabilize. Epoch AI administers the benchmark and has reported these records [30][31], though questions about evaluator independence persist given OpenAI's prior funding relationship [20][32][33]. Greg Burnham's substack piece 'What I Wish I Knew About FrontierMath' [34] may provide the most detailed outside analysis of the benchmark's structure and limitations to date.

Both stories arrive in a broader context of accelerating AI mathematical ambition. Quanta Magazine declared 'The AI Revolution in Math Has Arrived' in April 2026 [35], UC Irvine and USC received a $2.6 million DARPA grant for AI-driven mathematics on May 18 [36], and an arXiv survey on 'AI for Mathematics: Progress, Challenges, and Prospects' [37] frames the moment as a recognized inflection point. MIRI Berkeley has amplified the unit distance result specifically under the framing of 'autonomous' AI mathematical agency [38][39], treating the result as a capability and safety milestone rather than a mathematical one. The convergence of the Erdős disproof, the OpenAI technical PDF, the GPT-5.4 FrontierMath open-problem solution, and accelerating benchmark scores within a single week has compressed what seemed like a gradual trend into something the community is visibly struggling to absorb — with formal verification, peer credentialing, and evaluator independence all unresolved in parallel.

Timeline

1946: Paul Erdős first poses the planar unit distance conjecture in discrete geometry [6]
2026-04: Quanta Magazine publishes 'The AI Revolution in Math Has Arrived,' establishing AI mathematical capabilities as a recognized trend [35]
2026-05-18: UC Irvine and USC announce a $2.6 million DARPA grant for AI-driven mathematics [36]
2026-05-20: OpenAI announces a general-purpose reasoning model has disproved the unit distance conjecture via counterexample [1]
2026-05-21: ArXiv preprint 'Remarks on the disproof of the unit distance conjecture' appears; Gil Kalai endorses the result; Alex Dimakis's commentary is widely retweeted; The Guardian and Scientific American publish mainstream coverage [2][3][7][8][50][14][15]
2026-05-21: Po-Shen Loh, Zvi Mowshowitz, and other named commentators weigh in; Reddit r/math and Hacker News host extended discussion [51][48][52][53]
2026-05-22: Michael Harris publishes Silicon Reckoner posts on the Erdős result and 'artificial intuition'; New Scientist joins mainstream coverage; MIRI Berkeley amplifies result as 'autonomously disproved'; Epoch AI reports GPT-5.4 set a new FrontierMath record; Forbes publishes 'The AI Breakthrough That Has Mathematicians Paying Attention'; benchmark credibility controversy surfaces [17][18][16][38][31][20][32][33][13]
2026-05-23: Terence Tao posts on Mathstodon engaging with the approach; Sebastien Bubeck engages on X; Reddit r/MachineLearning hosts skeptical discussion; GPT-5.4 solves an open FrontierMath problem, which is autoformalized; Polish mathematician calls the result 'singularity' [22][9][12][39][43][47][24][25][27][28]
2026-05-24: OpenAI publishes formal PDF 'Planar Point Sets with Many Unit Distances'; Noam Solomon and Robert Talbert comment publicly; Michael Harris publishes 'The conversation on AI mathematics is expanding'; GPT-5.5 vs. GPT-5.4 FrontierMath comparison emerges; Reddit r/slatestarcodex and r/singularity host discussion; Greg Burnham publishes FrontierMath analysis [4][10][11][19][29][21][54][34][23]

Perspectives

OpenAI

Presents the unit distance disproof as a landmark milestone; has now published a formal technical PDF 'Planar Point Sets with Many Unit Distances' providing mathematical documentation; promotes GPT-5.4 for science and mathematics achievements including FrontierMath records; provides no methodological caveats in public announcements

Evolution: deepened — the publication of the technical PDF [4] moves beyond press-release promotion toward at least formal documentation, though the adequacy of that documentation for independent verification remains open

[1][40][41][4]

Terence Tao (mathematician, Fields Medal)

Has posted on Mathstodon with partial text 'I was able to use an extended…' indicating direct engagement with the approach or a related extension; the full extent of his assessment is not captured in available sources

Evolution: consistent — highest-credential mathematical engagement in the thread, though the publicly available fragment remains partial

[9]

Gil Kalai (mathematician, combinatorics blogger)

Enthusiastically endorses the result via both X and his WordPress blog, calling it 'amazing' and crediting AI directly; his engagement from inside the relevant mathematical community lends the result significant credibility

Evolution: consistent

[7][8][42]

Sebastien Bubeck (ML researcher, former Microsoft Research)

Has engaged on X regarding the unit distance result; the specific content of his assessment is not captured in available sources, but his engagement adds a technically credentialed ML perspective

Evolution: new — first appearance in the thread

[12]

Noam Solomon (mathematician)

Has posted on LinkedIn in response to the OpenAI unit distance result; specific stance not captured in available sources, but his named engagement from a mathematical background adds to the growing mathematician commentary

Evolution: new — first appearance in the thread

[10]

Michael Harris (mathematician, Silicon Reckoner blogger)

Has published multiple posts specifically addressing the Erdős result, 'artificial intuition,' and the broader AI mathematics conversation, building a sustained critical and epistemological engagement; previously characterized OpenAI's FrontierMath benchmark involvement as a 'scandal'

Evolution: deepened further — now three posts, suggesting running commentary rather than a reaction to a single event

[17][18][19][20]

MIRI Berkeley (AI safety research organization)

Amplifies the unit distance result under the framing of autonomous AI mathematical agency, treating it as significant for AI capability and safety discourse

Evolution: consistent

[38][39][43][44]

Epoch AI

Reports GPT-5.4 set a new FrontierMath record at 38% and solved an open FrontierMath problem, adding benchmark evidence for OpenAI's math capability claims while also reopening questions about evaluator independence

Evolution: consistent

[30][23][31][45][46]

Polish mathematician (Dr_Singularity / subject of Quantum Zeitgeist piece)

Spent 20 years designing very hard problems and describes GPT-5.4's solution of an open FrontierMath problem as 'singularity', the most maximalist framing from any commentator with direct domain expertise over this kind of problem

Evolution: deepened — the backstory of 20 years of problem design makes the 'singularity' characterization more substantively grounded than it initially appeared

[27][28][47]

Greg Burnham (Lemmata substack)

Published 'What I Wish I Knew About FrontierMath,' likely providing critical or inside analysis of the benchmark's structure and limitations; specific conclusions not captured in available sources

Evolution: new — first appearance in the thread

[34]

Robert Talbert (mathematician)

Has observed and commented on the character of the mathematical community's response to the OpenAI result; specific framing not captured in available sources

Evolution: new — first appearance in the thread

[11]

Forbes

Frames the unit distance result specifically around mathematician reactions, with a title emphasizing that mathematicians are 'paying attention' — a framing distinct from other mainstream outlets that led with 'biggest breakthrough'

Evolution: new outlet — adds a major business-press voice to coverage dominated by science and tech media

[13]

Scientific American / The Guardian / New Scientist

Characterize the unit distance result as AI's biggest or most significant mathematical breakthrough yet; quote mathematicians expressing amazement

Evolution: consistent

[15][14][16]

Reddit r/MachineLearning / r/slatestarcodex

Both communities host skeptical or measured discussion; r/MachineLearning frames the result under the word 'claims,' signaling community-level wariness; r/slatestarcodex discussion reflects rationalist community interest in capability implications

Evolution: r/slatestarcodex is new; r/MachineLearning stance consistent

[22][21]

TechCrunch / TechRepublic

Document the FrontierMath benchmark controversy: o3 scored lower on independent assessments than OpenAI initially implied, and OpenAI funded the benchmark, without initial public disclosure, before achieving records on it

Evolution: consistent

[32][33]

Zvi Mowshowitz

Cautiously impressed; calls this the first AI math result he finds genuinely impressive, embedding it within broader capability and safety commentary

Evolution: consistent

[48]

Rohan Paul (@rohanpaul_ai)

Bullish; reads the result as evidence that test-time compute on a general-purpose LLM is sufficient for research-grade mathematical output without specialized architecture

Evolution: consistent

[5]

Tensions

Transparency gap: OpenAI presents the unit distance disproof as a clean milestone [1][40] and has now published a formal PDF [4], but the broader commentary, from Alex Dimakis [50] to the arXiv preprint [2], treats the novel algebraic number theory method as the central puzzle, one that remains poorly explained publicly [5][6]. Whether the PDF closes this gap is now the live question. [1][40][4][5][6][2][50]
Autonomy framing: MIRI Berkeley and AI-capability amplifiers emphasize that the model 'autonomously disproved' the conjecture [38][39][44], while William Jin explicitly cautions that 'monumental' does not equal AGI [49] and the full degree of human scaffolding and problem-framing that shaped the result remains undisclosed by OpenAI [1]. [38][39][44][49][1]
Evaluator independence: Michael Harris labeled OpenAI's FrontierMath benchmark involvement a 'scandal' [20], and TechCrunch and TechRepublic have documented benchmark discrepancies in OpenAI's math claims [32][33]. Epoch AI's reports that GPT-5.4 achieved 38% and solved an open benchmark problem [30][23] deepen this tension — they may represent independent validation or another iteration of the same conflict-of-interest pattern. A GPT-5.5 comparison now emerging [29] may extend the pattern further. [20][32][33][30][23][29]
What kind of cognition is this? Michael Harris is explicitly exploring 'artificial intuition' and an expanding AI mathematics conversation [18][19], while Rohan Paul argues that test-time compute on a general-purpose LLM suffices for frontier discovery [5] — two framings with very different implications for how AI mathematical results should be credited and built upon. Sebastien Bubeck's engagement [12] may add a technically grounded position to this debate. [18][19][5][12]
Verification asymmetry: GPT-5.4's FrontierMath open-problem solution was autoformalized, that is, converted to machine-checkable form [24][25][26], while the Erdős unit distance disproof, now documented in the OpenAI PDF [4] and arXiv preprint [2], has not yet been confirmed by journal peer review or independent formal verification. This creates a credibility gap between the two AI math milestones despite both being attributed to OpenAI models. [24][25][26][4][2]
Maximalism vs. calibration: The Polish mathematician who spent 20 years designing very hard problems calls GPT-5.4's solution 'singularity' [27][28], the most maximalist framing from anyone with direct domain expertise over the specific challenge. This stands in direct tension with William Jin's explicit 'monumental but not AGI' caution [49] and the measured skepticism of technical communities [22][21]. [27][28][49][22][21]

Sources

[1] An OpenAI model has disproved a central conjecture in discrete geometry — OpenAI Blog (2026-05-20)
[2] [2605.20695] Remarks on the disproof of the unit distance conjecture — reactive:openai-erdos-math-breakthrough
[3] [PDF] Remarks on the disproof of the unit distance conjecture - arXiv — reactive:openai-erdos-math-breakthrough
[4] Planar Point Sets with Many Unit Distances — reactive:openai-erdos-math-breakthrough
[5] A general-purpose LLM can produce frontier research when given enough test-time compute. — Rohan Paul Twitter (2026-05-21)
[6] This is WILD! — Milk Road AI Twitter (2026-05-21)
[7] An internal model of Open AI disproved Erdos unit distance conjecture. — reactive:openai-erdos-math-breakthrough
[8] Unit Distance Problem | Combinatorics and more — reactive:openai-erdos-math-breakthrough
[9] Terence Tao: "I was able to use an extended …" - Mathstodon.xyz — reactive:openai-erdos-math-breakthrough
[10] Noam Solomon's Post — reactive:openai-erdos-math-breakthrough
[11] Very interesting. The response to this by mathematicians ... — reactive:openai-erdos-math-breakthrough
[12] Sebastien Bubeck (@SebastienBubeck) on X — reactive:openai-erdos-math-breakthrough
[13] The AI Breakthrough That Has Mathematicians Paying Attention — reactive:openai-erdos-math-breakthrough
[14] OpenAI makes breakthrough on 80-year-old maths problem — reactive:openai-erdos-math-breakthrough
[15] OpenAI announces AI's biggest math breakthrough yet — reactive:openai-erdos-math-breakthrough
[16] Mathematicians stunned by AI's biggest breakthrough in ... — reactive:ai-formal-math-breakthroughs
[17] About that Erdős problem - by Michael Harris - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[18] About "artificial intuition" - by Michael Harris - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[19] The conversation on AI mathematics is expanding - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[20] The FrontierMath scandal - by Michael Harris — reactive:openai-erdos-math-breakthrough
[21] An OpenAI model has disproved a central conjecture in ... — reactive:openai-erdos-math-breakthrough
[22] OpenAI claims a general-purpose reasoning model found a ... - Reddit — reactive:openai-erdos-math-breakthrough
[23] GPT-5.4 Pro Hits 38% on FrontierMath, Why This Matters? — reactive:openai-erdos-math-breakthrough
[24] GPT-5.4 solves its first open math problem from FrontierMath benchmark — reactive:openai-erdos-math-breakthrough
[25] GPT-5.4 Pro solved the first Open Frontier Math problem. It was then ... — reactive:openai-erdos-math-breakthrough
[26] Process-Driven Autoformalization in Lean 4 - OpenReview — reactive:openai-erdos-math-breakthrough
[27] Tough Math Problem Convinces Mathematician the Singularity Is Here — reactive:openai-erdos-math-breakthrough
[28] A Polish mathematician spent 20 years designing very hard ... — reactive:openai-erdos-math-breakthrough
[29] Alex runs the numbers on GPT-5.5 vs 5.4 and says frontier math is ... — reactive:openai-erdos-math-breakthrough
[30] GPT-5.4 set a new record on FrontierMath, our benchmark of ... — reactive:openai-erdos-math-breakthrough
[31] GPT-5.4 set a new record on FrontierMath - Epoch AI — reactive:openai-erdos-math-breakthrough
[32] OpenAI's o3 AI model scores lower on a benchmark ... - TechCrunch — reactive:openai-erdos-math-breakthrough
[33] OpenAI's o3: AI Benchmark Discrepancy Reveals Gaps in ... — reactive:openai-erdos-math-breakthrough
[34] What I Wish I Knew About FrontierMath - by Greg Burnham — reactive:openai-erdos-math-breakthrough
[35] The AI Revolution in Math Has Arrived | Quanta Magazine — reactive:openai-erdos-math-breakthrough
[36] UC Irvine, USC receive $2.6 million DARPA grant for AI to drive math breakthroughs – UC Irvine News — reactive:openai-erdos-math-breakthrough
[37] AI for Mathematics: Progress, Challenges, and Prospects - arXiv — reactive:openai-erdos-math-breakthrough
[38] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-24)
[39] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[40] The Erdős Breakthrough | OpenAI | 166 comments - LinkedIn — reactive:openai-erdos-math-breakthrough
[41] Advancing science and math with GPT-5.2 | OpenAI — reactive:openai-erdos-math-breakthrough
[42] Gil Kalai on the new AI proof of the Erdős unit distance problem — reactive:openai-erdos-math-breakthrough
[43] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[44] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-24)
[45] Epoch AI's Post - LinkedIn — reactive:openai-erdos-math-breakthrough
[46] FrontierMath: LLM Benchmark for Advanced AI Math Reasoning | Epoch AI — reactive:openai-erdos-math-breakthrough
[47] GPT-5.4 Just Cracked a 20-Year Math Problem — reactive:openai-erdos-math-breakthrough
[48] AI #169: New Knowledge — Zvi's AI Roundups (2026-05-21)
[49] @OpenAI This feels monumental. A general-purpose reasoning model making a frontier-level math contribution isn’t AGI, bu... — reactive:openai-erdos-math-breakthrough (2026-05-21)
[50] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[51] Po-Shen Loh's Post - LinkedIn — reactive:openai-erdos-math-breakthrough
[52] OpenAI's internal model disproves Unit Distance Conjecture of Erdos — reactive:openai-erdos-math-breakthrough
[53] An OpenAI model has disproved a central conjecture in discrete ... — reactive:openai-erdos-math-breakthrough
[54] An EpochAI Frontier Math open problem may have been solved for ... — reactive:openai-erdos-math-breakthrough