OpenAI Model Disproves 80-Year-Old Erdős Geometry Conjecture · history

Version 5

2026-05-24 19:34 UTC · 172 items

Changes since v4

The GPT-5.4 FrontierMath story has crystallized into a distinct sub-narrative: a specific benchmark score of 38% [19], and the milestone of solving an open FrontierMath problem that was subsequently autoformalized [20][21], which a Polish mathematician reportedly called 'singularity' [22]. This is now a second AI math milestone running in parallel with the Erdős disproof, and it introduces a new tension: the FrontierMath solution was reportedly autoformalized into machine-checkable form, while the Erdős counterexample has no comparable formal verification. Michael Harris has published a third Silicon Reckoner post titled 'The conversation on AI mathematics is expanding' [14], suggesting his engagement is becoming a sustained running commentary rather than a reaction to a single event.

What

An OpenAI general-purpose reasoning model disproved the planar unit distance conjecture, a problem open since Paul Erdős posed it in 1946, by constructing a counterexample; an arXiv preprint documents the mathematics [1][2]. Terence Tao has posted on Mathstodon engaging with the approach [11], and Michael Harris has published three Silicon Reckoner pieces: one on the result, one on 'artificial intuition,' and a broader post titled 'The conversation on AI mathematics is expanding' [12][13][14]. A parallel development has compounded the story: GPT-5.4 has achieved 38% on the FrontierMath benchmark [19] and solved an open problem from that benchmark, subsequently autoformalized [20][21], which a Polish mathematician described as 'singularity' [22].

Why it matters

The Erdős disproof and the GPT-5.4 FrontierMath open-problem result are now two distinct AI math milestones arriving within days of each other, together shifting the question from whether AI can solve research-grade mathematics to how quickly independent verification, formal proof, and peer community consensus can keep pace. The autoformalization of GPT-5.4's FrontierMath solution is a meaningful step toward machine-verifiable credibility — the kind of process that the Erdős result still visibly lacks.

Open questions

What exactly did Terence Tao post on Mathstodon — does 'I was able to use an extended…' constitute endorsement, independent verification, or an extension of the method beyond the original result? [11]
Michael Harris's post 'The conversation on AI mathematics is expanding' [14] appears to broaden his analysis — does it address the unit distance result specifically, or does it pivot to the wider wave of AI math milestones including the GPT-5.4 FrontierMath open-problem result?
GPT-5.4 solved an open FrontierMath problem that was then autoformalized [20][21] — which open problem was this, was the autoformalization independently verified in a proof assistant, and does this represent the first AI-solved open problem that has also been formally machine-checked?
The arXiv preprint on the unit distance disproof [2] has not yet been confirmed as submitted to a peer-reviewed journal — are independent mathematicians beyond Kalai publicly confirming or challenging the counterexample's validity?

Narrative

On May 20, 2026, OpenAI announced that one of its general-purpose reasoning models had disproved the planar unit distance conjecture in discrete geometry, a problem first posed by Paul Erdős in 1946 [1]. The model produced a counterexample, establishing that the conjecture is false. Within approximately one day, a preprint titled 'Remarks on the disproof of the unit distance conjecture' appeared on arXiv, providing the first publicly accessible formal documentation of the mathematics [2][3]. The method involved a surprising bridge between algebraic number theory and plane geometry, which multiple commentators treated as more significant than the bare fact of disproving an 80-year-old conjecture [4][5]. Gil Kalai, a prominent combinatorialist, endorsed the result via both X and his WordPress blog, calling it 'amazing' [6][7], and mainstream press including Scientific American, The Guardian, and New Scientist characterized it as AI's biggest mathematical breakthrough to date [9][8][10].

The result has drawn engagement at the highest level of professional mathematics. Terence Tao, widely regarded as among the greatest living mathematicians, posted on Mathstodon with partial text beginning 'I was able to use an extended…', indicating direct engagement with the approach or a related extension [11]. Michael Harris, a Columbia mathematician and author of the Silicon Reckoner Substack, has published multiple posts on the Erdős result, 'artificial intuition,' and most recently a piece titled 'The conversation on AI mathematics is expanding' [12][13][14], representing a sustained critical and epistemological engagement that goes beyond the single result. Harris had previously characterized OpenAI's FrontierMath benchmark involvement as a 'scandal' [15], lending his continued engagement particular weight in the credibility debate. MIRI Berkeley amplified the unit distance result under the framing 'An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry' [16][17], treating autonomous AI mathematical agency as the salient feature for AI safety and capability discourse. Reddit r/MachineLearning hosted discussion under the skeptical framing 'OpenAI claims a general-purpose reasoning model found a…'; the word 'claims' reflects a persistent wariness in technical circles [18].

A parallel and distinct development has emerged alongside the Erdős story. GPT-5.4 has achieved 38% on the FrontierMath benchmark [19] and, more strikingly, solved an open problem from that benchmark — a result that was subsequently autoformalized, meaning converted to a formal proof in a proof assistant [20][21]. A Polish mathematician reportedly described the achievement as 'singularity' [22]. This autoformalization milestone is meaningful because it addresses, at least partially, the peer verification gap that dogs the Erdős result: a machine-checked formal proof is independently auditable in a way that a counterexample described in a preprint is not. Epoch AI, which administers FrontierMath, reported the GPT-5.4 benchmark record separately [23][24], though questions about evaluator independence persist given OpenAI's prior funding relationship with the benchmark [15][25][26].

Both results arrive within a broader wave of AI progress in mathematics. Quanta Magazine published 'The AI Revolution in Math Has Arrived' in April 2026 [27], and UC Irvine and USC received a $2.6 million DARPA grant for AI-driven mathematics on May 18, 2026 [28]. The convergence of the Erdős disproof and the GPT-5.4 FrontierMath open-problem solution within the same week has compressed the timeline of what seemed like an unfolding trend into something resembling an inflection point — with community skepticism, mathematical credentialing, and formal verification all scrambling to keep pace.

Timeline

1946: Paul Erdős first poses the planar unit distance conjecture in discrete geometry [5]
2026-04: Quanta Magazine publishes 'The AI Revolution in Math Has Arrived,' establishing AI mathematical capabilities as a recognized trend [27]
2026-05-18: UC Irvine and USC announce a $2.6 million DARPA grant for AI-driven mathematics breakthroughs [28]
2026-05-20: OpenAI announces a general-purpose reasoning model has disproved the unit distance conjecture via counterexample [1]
2026-05-21: ArXiv preprint 'Remarks on the disproof of the unit distance conjecture' appears; Gil Kalai endorses the result via X and his blog; Alex Dimakis's commentary is widely retweeted; The Guardian and Scientific American publish mainstream coverage [2][3][6][7][34][8][9]
2026-05-21: Po-Shen Loh, Zvi Mowshowitz, and other named commentators weigh in; Reddit r/math and Hacker News host extended discussion [41][40][42][43]
2026-05-22: Michael Harris publishes Silicon Reckoner posts on the Erdős result and 'artificial intuition'; New Scientist joins mainstream coverage; MIRI Berkeley amplifies result as 'autonomously disproved'; Epoch AI reports GPT-5.4 set a new FrontierMath record; benchmark credibility controversy surfaces via Harris and tech press [12][13][10][16][24][15][25][26]
2026-05-23: Continued broad social media amplification; Reddit r/MachineLearning hosts skeptical discussion; Terence Tao posts on Mathstodon engaging with the approach; GPT-5.4 solves an open FrontierMath problem, which is autoformalized; a Polish mathematician calls the result 'singularity' [18][11][17][32][33][22][20][21]
2026-05-24: Further amplification across social platforms; Michael Harris publishes 'The conversation on AI mathematics is expanding'; coverage of GPT-5.4's 38% FrontierMath score widens; math communities continue discussing the unit distance result [29][44][45][14][19][46]

Perspectives

OpenAI

Presents the unit distance disproof as a landmark milestone titled 'The Erdős Breakthrough'; promotes GPT-5.2 for science and mathematics achievements, per its post 'Advancing science and math with GPT-5.2'; provides no methodological caveats in public announcements

Evolution: consistent

[1][29][30]

Terence Tao (mathematician, Fields Medal)

Has posted on Mathstodon with partial text 'I was able to use an extended…' indicating direct engagement with the approach or a related extension; the full extent of his assessment is not captured in available sources

Evolution: consistent — highest-credential mathematical engagement in the thread, though the publicly available fragment remains partial

[11]

Gil Kalai (mathematician, combinatorics blogger)

Enthusiastically endorses the result via both X and his WordPress blog, calling it 'amazing' and crediting AI directly; his engagement from inside the relevant mathematical community lends the result significant credibility

Evolution: consistent

[6][7][31]

Michael Harris (mathematician, Silicon Reckoner blogger)

Has published multiple posts specifically addressing the Erdős result, 'artificial intuition,' and now the broader AI mathematics conversation, building a sustained critical and analytical engagement; previously characterized OpenAI's FrontierMath benchmark involvement as a 'scandal'

Evolution: deepened further — expanded from two Erdős-specific posts to a third post framing AI mathematics as an expanding conversation, suggesting his engagement is becoming a running commentary rather than a single-event response

[12][13][14][15]

MIRI Berkeley (AI safety research organization)

Amplifies both the unit distance result and broader AI math milestones under the framing of autonomous AI mathematical agency, treating the results as significant for AI capability and safety discourse

Evolution: consistent

[16][17][32][33]

Epoch AI

Reports GPT-5.4 set a new FrontierMath record at 38% and solved an open FrontierMath problem, adding benchmark evidence for OpenAI's math capability claims while also reopening questions about evaluator independence

Evolution: deepened — the specific 38% score and open-problem milestone are more concrete than the initial record report

[23][19][24]

Alex Dimakis (ML/information theory researcher)

Frames the result as a breakthrough in combinatorics; his original post became the most widely retweeted expert commentary on the result

Evolution: consistent

[34][35][36][37][38]

Scientific American / The Guardian / New Scientist

Characterize the unit distance result as AI's biggest or most significant mathematical breakthrough yet; quote mathematicians expressing amazement

Evolution: consistent

[9][8][10]

Reddit r/MachineLearning

Hosts discussion under the skeptical framing 'OpenAI claims a general-purpose reasoning model found a…' — the word 'claims' signals community-level wariness about OpenAI's self-reported math achievements

Evolution: consistent

[18]

TechCrunch / TechRepublic

Document the FrontierMath benchmark controversy: o3 scored lower on independent assessments than OpenAI initially implied, and OpenAI secretly funded the benchmark before achieving records on it

Evolution: consistent

[25][26]

William Jin (@WilliamJin06)

Measured enthusiasm: calls the result 'monumental' but explicitly distinguishes it from AGI, signaling a preference for calibrated rather than maximalist interpretation

Evolution: consistent

[39]

Zvi Mowshowitz

Cautiously impressed; calls this the first AI math result he finds genuinely impressive, embedding it within broader capability and safety commentary

Evolution: consistent

[40]

Rohan Paul (@rohanpaul_ai)

Bullish; reads the result as evidence that test-time compute on a general-purpose LLM is sufficient for research-grade mathematical output without specialized architecture

Evolution: consistent

[4]

Tensions

Transparency gap: OpenAI presents the unit distance disproof as a clean milestone framed as 'The Erdős Breakthrough' [1][29], while the broader commentary, from Rohan Paul to Milk Road AI to the arXiv preprint, treats the novel algebraic number theory method as the central puzzle that remains poorly explained in the public announcement [4][5][2]. [1][29][4][5][2]
Autonomy framing: MIRI Berkeley and AI-capability boosters emphasize that the model 'autonomously disproved' the conjecture [16][17], while William Jin explicitly cautions that 'monumental' does not equal AGI [39] — a debate about what degree of human scaffolding and problem-framing shaped the result that OpenAI's sparse announcement does not resolve [1]. [16][17][39][1]
Evaluator independence: Michael Harris has explicitly labeled OpenAI's FrontierMath benchmark involvement a 'scandal' [15], and TechCrunch and TechRepublic have independently documented benchmark discrepancies in OpenAI's math claims [25][26]. Epoch AI's reports that GPT-5.4 achieved 38% on FrontierMath and solved an open benchmark problem [23][19][20] deepen this tension: they may represent independent validation or another iteration of the same conflict-of-interest pattern. [15][25][26][23][19][20]
What kind of cognition is this? Michael Harris is explicitly exploring 'artificial intuition' and an 'expanding conversation on AI mathematics' [13][14], while Rohan Paul and others argue that test-time compute on a general-purpose LLM is sufficient for frontier discovery [4] — two framings that carry very different implications for how results should be understood, credited, and built upon. [13][14][4]
Verification asymmetry: GPT-5.4's FrontierMath open-problem solution was autoformalized — converted to machine-checkable form [20][21] — while the Erdős unit distance disproof remains an arXiv preprint without confirmed peer-reviewed journal submission or independent formal verification [2]. This creates a credibility gap between the two AI math milestones despite both being attributed to OpenAI models. [20][21][2]

Sources

[1] An OpenAI model has disproved a central conjecture in discrete geometry — OpenAI Blog (2026-05-20)
[2] [2605.20695] Remarks on the disproof of the unit distance conjecture — reactive:openai-erdos-math-breakthrough
[3] [PDF] Remarks on the disproof of the unit distance conjecture - arXiv — reactive:openai-erdos-math-breakthrough
[4] A general-purpose LLM can produce frontier research when given enough test-time compute. — Rohan Paul Twitter (2026-05-21)
[5] This is WILD! — Milk Road AI Twitter (2026-05-21)
[6] An internal model of Open AI disproved Erdos unit distance conjecture. — reactive:openai-erdos-math-breakthrough
[7] Unit Distance Problem | Combinatorics and more — reactive:openai-erdos-math-breakthrough
[8] OpenAI makes breakthrough on 80-year-old maths problem — reactive:openai-erdos-math-breakthrough
[9] OpenAI announces AI's biggest math breakthrough yet — reactive:openai-erdos-math-breakthrough
[10] Mathematicians stunned by AI's biggest breakthrough in ... — reactive:ai-formal-math-breakthroughs
[11] Terence Tao: "I was able to use an extended …" - Mathstodon.xyz — reactive:openai-erdos-math-breakthrough
[12] About that Erdős problem - by Michael Harris - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[13] About "artificial intuition" - by Michael Harris - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[14] The conversation on AI mathematics is expanding - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[15] The FrontierMath scandal - by Michael Harris — reactive:openai-erdos-math-breakthrough
[16] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-24)
[17] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[18] OpenAI claims a general-purpose reasoning model found a ... - Reddit — reactive:openai-erdos-math-breakthrough
[19] GPT-5.4 Pro Hits 38% on FrontierMath, Why This Matters? — reactive:openai-erdos-math-breakthrough
[20] GPT-5.4 solves its first open math problem from FrontierMath benchmark — reactive:openai-erdos-math-breakthrough
[21] GPT-5.4 Pro solved the first Open Frontier Math problem. It was then ... — reactive:openai-erdos-math-breakthrough
[22] GPT-5.4 Just Cracked a 20-Year Math Problem — reactive:openai-erdos-math-breakthrough
[23] GPT-5.4 set a new record on FrontierMath, our benchmark of ... — reactive:openai-erdos-math-breakthrough
[24] GPT-5.4 set a new record on FrontierMath - Epoch AI — reactive:openai-erdos-math-breakthrough
[25] OpenAI's o3 AI model scores lower on a benchmark ... - TechCrunch — reactive:openai-erdos-math-breakthrough
[26] OpenAI's o3: AI Benchmark Discrepancy Reveals Gaps in ... — reactive:openai-erdos-math-breakthrough
[27] The AI Revolution in Math Has Arrived | Quanta Magazine — reactive:openai-erdos-math-breakthrough
[28] UC Irvine, USC receive $2.6 million DARPA grant for AI to drive math breakthroughs – UC Irvine News — reactive:openai-erdos-math-breakthrough
[29] The Erdős Breakthrough | OpenAI | 166 comments - LinkedIn — reactive:openai-erdos-math-breakthrough
[30] Advancing science and math with GPT-5.2 | OpenAI — reactive:openai-erdos-math-breakthrough
[31] Gil Kalai on the new AI proof of the Erdős unit distance problem — reactive:openai-erdos-math-breakthrough
[32] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[33] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[34] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[35] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[36] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[37] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[38] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-21)
[39] @OpenAI This feels monumental. A general-purpose reasoning model making a frontier-level math contribution isn’t AGI, bu... — reactive:openai-erdos-math-breakthrough (2026-05-21)
[40] AI #169: New Knowledge — Zvi's AI Roundups (2026-05-21)
[41] Po-Shen Loh's Post - LinkedIn — reactive:openai-erdos-math-breakthrough
[42] OpenAI's internal model disproves Unit Distance Conjecture of Erdos — reactive:openai-erdos-math-breakthrough
[43] An OpenAI model has disproved a central conjecture in discrete ... — reactive:openai-erdos-math-breakthrough
[44] 🚨 BREAKING: OpenAI just made history. One of their internal reasoning models autonomously disproved the 80-year-old Erdő... — reactive:openai-erdos-math-breakthrough (2026-05-24)
[45] great article putting tgthr — reactive:openai-erdos-math-breakthrough (2026-05-24)
[46] Dense Graphs - Erdős unit distance conjecture proved false. — reactive:openai-erdos-math-breakthrough