OpenAI Model Disproves 80-Year-Old Erdős Geometry Conjecture

Version 7

2026-05-25 11:35 UTC · 199 items

What

An OpenAI reasoning model disproved the planar unit distance conjecture, open since Paul Erdős posed it in 1946, by constructing a counterexample using algebraic number theory methods [1][2]. OpenAI has published a formal PDF titled 'Planar Point Sets with Many Unit Distances' [4] and a YouTube video of the same name [5], and the result has reached mainstream social media audiences [18]. The mathematical community's response now ranges from enthusiastic endorsement by Gil Kalai [8] and direct engagement from Terence Tao [10] to a Reddit r/math thread explicitly framing coverage as 'AI misinformation' [19], a notable hardening of tone within the core mathematics community. A parallel story continues to develop: GPT-5.4 scored 38% on the FrontierMath benchmark and solved an open benchmark problem, and that solution was subsequently autoformalized for machine-checkable verification [27][28].

Why it matters

The r/math 'AI misinformation' thread [19] represents the sharpest pushback yet from the mathematical community: not measured skepticism, but an explicit claim that something in how the result is being communicated is false or misleading. Whether the OpenAI PDF [4] can satisfy that scrutiny — and whether the result's credibility survives this accusation alongside endorsements from credentialed mathematicians — will determine how this episode is remembered in the broader narrative of AI mathematical capability.

Open questions

The Reddit r/math thread 'AI misinformation and Erdos problems' [19] makes an active accusation — what specifically is being disputed: the validity of the disproof itself, the autonomy framing, the media characterization of significance, or something else in how OpenAI described the result?
Does the OpenAI PDF 'Planar Point Sets with Many Unit Distances' [4] supply the algebraic number theory construction in sufficient detail for independent mathematical verification, or does it describe the result without fully exposing the method?
Will the misinformation framing on r/math [19] consolidate into a formal critique or published rebuttal, or will it be absorbed by the endorsements from Kalai [8][9] and Tao [10] and the arXiv preprint [2]?
The GPT-5.5 vs. GPT-5.4 FrontierMath comparison [32] suggests benchmark progress is accelerating — does GPT-5.5 surpass the 38% record, and does this widen the evaluator-independence concern or indicate the benchmark is being outpaced?

Narrative

On May 20, 2026, OpenAI announced that one of its general-purpose reasoning models had disproved the planar unit distance conjecture in discrete geometry, a problem first posed by Paul Erdős in 1946 [1]. The model produced an explicit counterexample, which establishes that the conjecture is false. Within approximately one day, a preprint titled 'Remarks on the disproof of the unit distance conjecture' appeared on arXiv [2][3], and OpenAI subsequently published a formal PDF titled 'Planar Point Sets with Many Unit Distances' [4] alongside a YouTube video of the same name [5]. The method involved a bridge between algebraic number theory and plane geometry that multiple commentators identified as the intellectually surprising core of the result, more significant than the bare fact of disproving an 80-year-old conjecture [6][7].

The result has drawn a wide spectrum of responses. Gil Kalai, a prominent combinatorialist, called it 'amazing' on both X and his blog [8][9]. Terence Tao posted on Mathstodon; the available fragment begins 'I was able to use an extended…', indicating direct engagement with the approach [10]. Noam Solomon and Robert Talbert have each commented publicly [11][12], and Sebastien Bubeck, an ML researcher known for foundational work on reasoning in language models, has engaged on X [13]. Forbes joined mainstream coverage with 'The AI Breakthrough That Has Mathematicians Paying Attention' [14], while Scientific American, The Guardian, and New Scientist characterized the result as AI's most significant mathematical achievement yet [15][16][17]. The story has now spread to mainstream social media audiences [18]. Against this enthusiasm, a Reddit r/math thread titled 'AI misinformation and Erdos problems' [19] signals that some practicing mathematicians believe the result is being materially misrepresented, the sharpest pushback so far from the mathematical community's core. Michael Harris at Columbia has published multiple Silicon Reckoner posts on the Erdős result, 'artificial intuition,' and the expanding AI-mathematics conversation [20][21][22], and previously characterized OpenAI's FrontierMath involvement as a 'scandal' [23]. Reddit communities from r/slatestarcodex [24] to r/MachineLearning [25] have hosted extended discussion, the latter framing the result under the skeptical word 'claims.'

A parallel development has run alongside the Erdős story. GPT-5.4 achieved 38% on the FrontierMath benchmark [26] and solved an open problem from that benchmark. That solution was subsequently autoformalized, meaning it was converted into a formal proof that a proof assistant can check [27][28][29]. A Polish mathematician who spent 20 years designing the benchmark's hardest problems described this as 'singularity' [30][31], the most maximalist framing from any commentator with direct domain expertise. A GPT-5.5 vs. GPT-5.4 FrontierMath comparison has emerged [32], suggesting benchmark progress is outpacing the evaluation infrastructure. Epoch AI administers the benchmark and has reported these records [33][34], though questions about evaluator independence persist given OpenAI's prior funding relationship [23][35][36]. Greg Burnham's Substack piece 'What I Wish I Knew About FrontierMath' [37] provides a detailed outside analysis of the benchmark's structure and limitations.

Both stories arrive in a broader context of accelerating AI mathematical ambition. Quanta Magazine declared 'The AI Revolution in Math Has Arrived' in April 2026 [38], UC Irvine and USC received a $2.6 million DARPA grant for AI-driven mathematics on May 18 [39], and MIRI Berkeley has amplified the unit distance result under the framing of autonomous AI mathematical agency [40][41], treating it as a capability and safety milestone. The convergence of the Erdős disproof, the OpenAI technical PDF, the GPT-5.4 FrontierMath open-problem solution, and the r/math misinformation accusation within the span of a week has compressed a gradual trend into something the mathematical community is visibly struggling to process — with formal verification, peer credentialing, evaluator independence, and now basic factual accuracy all contested in parallel.

Timeline

1946: Paul Erdős first poses the planar unit distance conjecture in discrete geometry [7]
2026-04: Quanta Magazine publishes 'The AI Revolution in Math Has Arrived,' establishing AI mathematical capabilities as a recognized trend [38]
2026-05-18: UC Irvine and USC announce a $2.6 million DARPA grant for AI-driven mathematics [39]
2026-05-20: OpenAI announces a general-purpose reasoning model has disproved the unit distance conjecture via counterexample [1]
2026-05-21: arXiv preprint 'Remarks on the disproof of the unit distance conjecture' appears; Gil Kalai endorses the result; Alex Dimakis's commentary is widely retweeted; The Guardian and Scientific American publish mainstream coverage [2][3][8][9][53][15][16]
2026-05-21: Po-Shen Loh, Zvi Mowshowitz, and other named commentators weigh in; Reddit r/math and Hacker News host extended discussion [54][51][44][55]
2026-05-22: Michael Harris publishes Silicon Reckoner posts on the Erdős result and 'artificial intuition'; New Scientist joins mainstream coverage; MIRI Berkeley amplifies result as 'autonomously disproved'; Epoch AI reports GPT-5.4 set a new FrontierMath record; Forbes publishes 'The AI Breakthrough That Has Mathematicians Paying Attention'; benchmark credibility controversy surfaces [20][21][17][40][34][23][35][36][14]
2026-05-23: Terence Tao posts on Mathstodon engaging with the approach; Sebastien Bubeck engages on X; Reddit r/MachineLearning hosts skeptical discussion; GPT-5.4 solves an open FrontierMath problem, which is autoformalized; Polish mathematician calls the result 'singularity' [25][10][13][41][46][50][27][28][30][31]
2026-05-24: OpenAI publishes formal PDF 'Planar Point Sets with Many Unit Distances'; Noam Solomon and Robert Talbert comment publicly; Michael Harris publishes 'The conversation on AI mathematics is expanding'; GPT-5.5 vs. GPT-5.4 FrontierMath comparison emerges; Reddit r/slatestarcodex hosts discussion; Greg Burnham publishes FrontierMath analysis [4][11][12][22][32][24][56][37][26]
2026-05-25: Reddit r/math thread 'AI misinformation and Erdos problems' explicitly frames coverage as misleading; YouTube video 'Planar Point Sets with Many Unit Distances' appears; result reaches mainstream social media audiences [19][5][18]

Perspectives

OpenAI

Presents the unit distance disproof as a landmark milestone; has published a formal technical PDF 'Planar Point Sets with Many Unit Distances' and an accompanying YouTube video providing mathematical documentation; promotes GPT-5.4 for science and mathematics achievements including FrontierMath records; provides no methodological caveats in public announcements

Evolution: deepened — the YouTube video [5] extends the formal documentation effort begun with the PDF [4], but neither addresses the community's factual accuracy concerns

[1][42][43][4][5]

Reddit r/math community

A thread titled 'AI misinformation and Erdos problems' [19] explicitly frames coverage as misleading — the sharpest pushback from the mathematical community yet, going beyond the measured skepticism of prior discussion threads

Evolution: hardened — previous r/math engagement was extended discussion; this thread makes an active misinformation accusation, representing a qualitative shift in community tone

[19][44]

Terence Tao (mathematician, Fields Medal)

Has posted on Mathstodon with partial text 'I was able to use an extended…' indicating direct engagement with the approach or a related extension; the full extent of his assessment is not captured in available sources

Evolution: consistent — highest-credential mathematical engagement in the thread, though the publicly available fragment remains partial

[10]

Gil Kalai (mathematician, combinatorics blogger)

Enthusiastically endorses the result via both X and his WordPress blog, calling it 'amazing' and crediting AI directly; his engagement from inside the relevant mathematical community lends the result significant credibility

Evolution: consistent

[8][9][45]

Sebastien Bubeck (ML researcher, former Microsoft Research)

Has engaged on X regarding the unit distance result; the specific content of his assessment is not captured in available sources, but his engagement adds a technically credentialed ML perspective

Evolution: consistent

[13]

Noam Solomon (mathematician)

Has posted on LinkedIn in response to the OpenAI unit distance result; specific stance not captured in available sources

Evolution: consistent

[11]

Michael Harris (mathematician, Silicon Reckoner blogger)

Has published multiple posts specifically addressing the Erdős result, 'artificial intuition,' and the broader AI mathematics conversation, building a sustained critical and epistemological engagement; previously characterized OpenAI's FrontierMath benchmark involvement as a 'scandal'

Evolution: consistent — running commentary continues across multiple posts

[20][21][22][23]

MIRI Berkeley (AI safety research organization)

Amplifies the unit distance result under the framing of autonomous AI mathematical agency, treating it as significant for AI capability and safety discourse rather than purely as a mathematical milestone

Evolution: consistent

[40][41][46][47]

Epoch AI

Reports GPT-5.4 set a new FrontierMath record at 38% and solved an open FrontierMath problem, adding benchmark evidence for OpenAI's math capability claims while also reopening questions about evaluator independence

Evolution: consistent

[33][26][34][48][49]

Polish mathematician (Dr_Singularity / subject of Quantum Zeitgeist piece)

Spent 20 years designing the hard problems in the FrontierMath benchmark and describes GPT-5.4's solution of an open problem as 'singularity', the most maximalist framing from any commentator with direct domain expertise over the specific challenge

Evolution: consistent

[30][31][50]

Greg Burnham (Lemmata Substack)

Published 'What I Wish I Knew About FrontierMath,' providing analysis of the benchmark's structure and limitations; specific conclusions not captured in available sources

Evolution: consistent

[37]

Robert Talbert (mathematician)

Has observed and commented on the character of the mathematical community's response to the OpenAI result; specific framing not captured in available sources

Evolution: consistent

[12]

Forbes

Frames the unit distance result specifically around mathematician reactions, with a title emphasizing that mathematicians are 'paying attention' — a framing distinct from other mainstream outlets that led with 'biggest breakthrough'

Evolution: consistent

[14]

Scientific American / The Guardian / New Scientist

Characterize the unit distance result as AI's biggest or most significant mathematical breakthrough yet; quote mathematicians expressing amazement

Evolution: consistent

[16][15][17]

Reddit r/MachineLearning / r/slatestarcodex

Both communities host skeptical or measured discussion; r/MachineLearning frames the result under the word 'claims,' signaling community-level wariness; r/slatestarcodex discussion reflects rationalist community interest in capability implications

Evolution: consistent

[25][24]

TechCrunch / TechRepublic

Document the FrontierMath benchmark controversy: o3 scored lower on independent assessments than OpenAI initially implied, and OpenAI secretly funded the benchmark before achieving records on it

Evolution: consistent

[35][36]

Zvi Mowshowitz

Cautiously impressed; calls this the first AI math result he finds genuinely impressive, embedding it within broader capability and safety commentary

Evolution: consistent

[51]

Rohan Paul (@rohanpaul_ai)

Bullish; reads the result as evidence that test-time compute on a general-purpose LLM is sufficient for research-grade mathematical output without specialized architecture

Evolution: consistent

[6]

Tensions

Misinformation vs. legitimate milestone: The Reddit r/math thread 'AI misinformation and Erdos problems' [19] explicitly accuses coverage of being misleading, directly contesting the enthusiastic endorsements from Gil Kalai [8][9] and the mainstream media characterization of the result as AI's most significant mathematical achievement [15][16] — creating an active factual dispute within the mathematical community. [19][8][9][15][16]
Transparency gap: OpenAI presents the unit distance disproof as a clean milestone [1][42] and has published a formal PDF and YouTube video [4][5], but the broader commentary treats the novel algebraic number theory method as the central puzzle that remains poorly explained publicly [6][7][2]. Whether the PDF and video close this gap remains open. [1][42][4][5][6][7][2]
Autonomy framing: MIRI Berkeley and AI-capability amplifiers emphasize that the model 'autonomously disproved' the conjecture [40][41][47], while William Jin explicitly cautions that 'monumental' does not equal AGI [52] and the full degree of human scaffolding and problem-framing that shaped the result remains undisclosed by OpenAI [1]. [40][41][47][52][1]
Evaluator independence: Michael Harris labeled OpenAI's FrontierMath benchmark involvement a 'scandal' [23], and TechCrunch and TechRepublic have documented benchmark discrepancies in OpenAI's math claims [35][36]. Epoch AI's reports that GPT-5.4 achieved 38% and solved an open benchmark problem [33][26] deepen this tension — they may represent independent validation or another iteration of the same conflict-of-interest pattern. [23][35][36][33][26][32]
What kind of cognition is this? Michael Harris is explicitly exploring 'artificial intuition' and an expanding AI mathematics conversation [21][22], while Rohan Paul argues that test-time compute on a general-purpose LLM suffices for frontier discovery [6] — two framings with very different implications for how AI mathematical results should be credited and built upon. [21][22][6][13]
Verification asymmetry: GPT-5.4's FrontierMath open-problem solution was autoformalized, that is, converted to machine-checkable form [27][28][29]. The Erdős unit distance disproof, documented in the OpenAI PDF [4] and arXiv preprint [2], has not yet been confirmed as submitted to a peer-reviewed journal or as independently formally verified. This creates a credibility gap between the two AI math milestones despite both being attributed to OpenAI models. [27][28][29][4][2]
Maximalism vs. calibration: The Polish mathematician who designed FrontierMath's hardest problems calls GPT-5.4's solution 'singularity' [30][31] — the most maximalist framing from anyone with direct domain expertise over the specific challenge. This stands in direct tension with William Jin's explicit 'monumental but not AGI' caution [52] and the measured skepticism of technical communities [25][24]. [30][31][52][25][24]

Sources

[1] An OpenAI model has disproved a central conjecture in discrete geometry — OpenAI Blog (2026-05-20)
[2] [2605.20695] Remarks on the disproof of the unit distance conjecture — reactive:openai-erdos-math-breakthrough
[3] [PDF] Remarks on the disproof of the unit distance conjecture - arXiv — reactive:openai-erdos-math-breakthrough
[4] Planar Point Sets with Many Unit Distances — reactive:openai-erdos-math-breakthrough
[5] Planar Point Sets with Many Unit Distances — reactive:openai-erdos-math-breakthrough
[6] A general-purpose LLM can produce frontier research when given enough test-time compute. — Rohan Paul Twitter (2026-05-21)
[7] This is WILD! — Milk Road AI Twitter (2026-05-21)
[8] An internal model of Open AI disproved Erdos unit distance conjecture. — reactive:openai-erdos-math-breakthrough
[9] Unit Distance Problem | Combinatorics and more — reactive:openai-erdos-math-breakthrough
[10] Terence Tao: "I was able to use an extended …" - Mathstodon.xyz — reactive:openai-erdos-math-breakthrough
[11] Noam Solomon's Post — reactive:openai-erdos-math-breakthrough
[12] Very interesting. The response to this by mathematicians ... — reactive:openai-erdos-math-breakthrough
[13] Sebastien Bubeck (@SebastienBubeck) on X — reactive:openai-erdos-math-breakthrough
[14] The AI Breakthrough That Has Mathematicians Paying Attention — reactive:openai-erdos-math-breakthrough
[15] OpenAI makes breakthrough on 80-year-old maths problem — reactive:openai-erdos-math-breakthrough
[16] OpenAI announces AI's biggest math breakthrough yet — reactive:openai-erdos-math-breakthrough
[17] Mathematicians stunned by AI's biggest breakthrough in ... — reactive:ai-formal-math-breakthroughs
[18] An AI just solved a math problem that stumped the world's best ... — reactive:openai-erdos-math-breakthrough
[19] AI misinformation and Erdos problems : r/math - Reddit — reactive:openai-erdos-math-breakthrough
[20] About that Erdős problem - by Michael Harris - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[21] About "artificial intuition" - by Michael Harris - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[22] The conversation on AI mathematics is expanding - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[23] The FrontierMath scandal - by Michael Harris — reactive:openai-erdos-math-breakthrough
[24] An OpenAI model has disproved a central conjecture in ... — reactive:openai-erdos-math-breakthrough
[25] OpenAI claims a general-purpose reasoning model found a ... - Reddit — reactive:openai-erdos-math-breakthrough
[26] GPT-5.4 Pro Hits 38% on FrontierMath, Why This Matters? — reactive:openai-erdos-math-breakthrough
[27] GPT-5.4 solves its first open math problem from FrontierMath benchmark — reactive:openai-erdos-math-breakthrough
[28] GPT-5.4 Pro solved the first Open Frontier Math problem. It was then ... — reactive:openai-erdos-math-breakthrough
[29] Process-Driven Autoformalization in Lean 4 - OpenReview — reactive:openai-erdos-math-breakthrough
[30] Tough Math Problem Convinces Mathematician the Singularity Is Here — reactive:openai-erdos-math-breakthrough
[31] A Polish mathematician spent 20 years designing very hard ... — reactive:openai-erdos-math-breakthrough
[32] Alex runs the numbers on GPT-5.5 vs 5.4 and says frontier math is ... — reactive:openai-erdos-math-breakthrough
[33] GPT-5.4 set a new record on FrontierMath, our benchmark of ... — reactive:openai-erdos-math-breakthrough
[34] GPT-5.4 set a new record on FrontierMath - Epoch AI — reactive:openai-erdos-math-breakthrough
[35] OpenAI's o3 AI model scores lower on a benchmark ... - TechCrunch — reactive:openai-erdos-math-breakthrough
[36] OpenAI's o3: AI Benchmark Discrepancy Reveals Gaps in ... — reactive:openai-erdos-math-breakthrough
[37] What I Wish I Knew About FrontierMath - by Greg Burnham — reactive:openai-erdos-math-breakthrough
[38] The AI Revolution in Math Has Arrived | Quanta Magazine — reactive:openai-erdos-math-breakthrough
[39] UC Irvine, USC receive $2.6 million DARPA grant for AI to drive math breakthroughs – UC Irvine News — reactive:openai-erdos-math-breakthrough
[40] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-24)
[41] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[42] The Erdős Breakthrough | OpenAI | 166 comments - LinkedIn — reactive:openai-erdos-math-breakthrough
[43] Advancing science and math with GPT-5.2 | OpenAI — reactive:openai-erdos-math-breakthrough
[44] OpenAI's internal model disproves Unit Distance Conjecture of Erdos — reactive:openai-erdos-math-breakthrough
[45] Gil Kalai on the new AI proof of the Erdős unit distance problem — reactive:openai-erdos-math-breakthrough
[46] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[47] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-24)
[48] Epoch AI's Post - LinkedIn — reactive:openai-erdos-math-breakthrough
[49] FrontierMath: LLM Benchmark for Advanced AI Math Reasoning | Epoch AI — reactive:openai-erdos-math-breakthrough
[50] GPT-5.4 Just Cracked a 20-Year Math Problem — reactive:openai-erdos-math-breakthrough
[51] AI #169: New Knowledge — Zvi's AI Roundups (2026-05-21)
[52] @OpenAI This feels monumental. A general-purpose reasoning model making a frontier-level math contribution isn’t AGI, bu... — reactive:openai-erdos-math-breakthrough (2026-05-21)
[53] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[54] Po-Shen Loh's Post - LinkedIn — reactive:openai-erdos-math-breakthrough
[55] An OpenAI model has disproved a central conjecture in discrete ... — reactive:openai-erdos-math-breakthrough
[56] An EpochAI Frontier Math open problem may have been solved for ... — reactive:openai-erdos-math-breakthrough