OpenAI Model Disproves 80-Year-Old Erdős Geometry Conjecture · history

Version 4

2026-05-24 10:31 UTC · 158 items

Changes since v3

Terence Tao has surfaced as a significant new voice, posting on Mathstodon about the approach. This is the highest-credential mathematical engagement yet in the thread, though only a fragment of the post is publicly available [11]. Michael Harris has deepened his engagement from benchmark criticism to two dedicated Silicon Reckoner posts on the Erdős result and 'artificial intuition,' adding sustained epistemological analysis [12][13]. Epoch AI's report of a GPT-5.4 FrontierMath record [15] is a new development that both extends the capability narrative and sharpens the evaluator independence tension. MIRI Berkeley's amplification with the 'autonomously disproved' framing [17][18] has added the AI safety community as a distinct named voice, and Reddit r/MachineLearning's skeptical 'OpenAI claims' framing [22] adds another named community perspective not present in the previous pass.

What

An OpenAI general-purpose reasoning model disproved the planar unit distance conjecture, a problem open since Paul Erdős posed it in 1946, by constructing a counterexample, with a formal arXiv preprint providing the mathematical basis [1][2]. The result has now drawn engagement from Terence Tao, widely regarded as one of the greatest living mathematicians, who posted on Mathstodon about the approach [11], and from Michael Harris, who has published two dedicated Silicon Reckoner pieces engaging with the result and with the concept of 'artificial intuition' [12][13]. New Scientist has joined The Guardian and Scientific American in calling it AI's biggest mathematical breakthrough to date [10]. Separately, Epoch AI reports that GPT-5.4 has set a new FrontierMath benchmark record [15], a development that simultaneously extends the AI math capability narrative and sharpens existing questions about evaluator independence.

Why it matters

Tao's engagement signals that the result is being taken seriously at the highest level of professional mathematics, potentially moving the story from 'AI makes headline claim' toward 'mathematical community actively engages with a genuine result.' The concurrent GPT-5.4 FrontierMath record sharpens a structural question about whether OpenAI's math achievement ecosystem — from benchmarks to open problems — is sufficiently insulated from conflicts of interest to be trusted as the field builds on these results.

Open questions

What exactly did Terence Tao say about the result on Mathstodon — does his partial post 'I was able to use an extended…' constitute endorsement, independent verification, or extension of the method? [11]
Michael Harris has published a dedicated piece on 'artificial intuition' alongside his Erdős analysis [13] — is his framing a critique of attributing genuine mathematical understanding to the model, or a more neutral inquiry into what kind of cognition was demonstrated? [12]
Epoch AI reports GPT-5.4 set a new FrontierMath record [15] — was this evaluation independent of OpenAI funding, and does it close or deepen the evaluator independence gap that Harris labeled a 'scandal' in the earlier o3/FrontierMath episode? [14]
Has the arXiv preprint been submitted to a peer-reviewed journal, and are independent mathematicians publicly confirming or challenging the counterexample's validity beyond Kalai's endorsement? [2][3]

Narrative

On May 20, 2026, OpenAI announced that one of its general-purpose reasoning models had disproved the planar unit distance conjecture in discrete geometry, a problem first posed by Paul Erdős in 1946 [1]. The model produced a counterexample rather than a proof of the conjecture, establishing that the conjecture is false. Within approximately one day, a preprint titled 'Remarks on the disproof of the unit distance conjecture' appeared on arXiv, providing the first publicly accessible formal documentation of the mathematics [2][3]. The method involved a surprising bridge between algebraic number theory and plane geometry, which multiple commentators treated as more significant than the bare fact of disproving an 80-year-old conjecture [4][5]. Gil Kalai, a prominent combinatorialist, endorsed the result via both X and his WordPress blog, calling it 'amazing' [6][7], and mainstream press including The Guardian, Scientific American, and New Scientist characterized it as AI's biggest mathematical breakthrough to date [8][9][10].

The result has since drawn engagement at the highest level of professional mathematics. Terence Tao, widely regarded as one of the greatest living mathematicians, posted on Mathstodon with partial text beginning 'I was able to use an extended…', indicating direct engagement with the approach or a related extension [11]. His engagement, however incomplete the public record of it, carries significant weight for the result's standing in the professional mathematical community. Michael Harris, a Columbia mathematician and author of the Silicon Reckoner Substack, has published two dedicated posts: 'About that Erdős problem' and 'About artificial intuition,' the latter exploring what kind of cognition the model demonstrated [12][13]. Harris had previously characterized OpenAI's FrontierMath benchmark involvement as a 'scandal' [14], and his new posts represent a sustained critical and epistemological engagement with this specific result rather than a passing remark.

A parallel development adds new texture to the evaluator independence question. Epoch AI reports that GPT-5.4 has set a new record on FrontierMath [15], and OpenAI has separately promoted GPT-5.2 for science and mathematics [16]. These disclosures indicate OpenAI is actively pursuing multiple math-capable model lines simultaneously, which compounds existing concerns about the independence of benchmark performance claims from model development. The AI safety research organization MIRI Berkeley amplified the unit distance result across social media under the framing 'An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry' [17][18], treating autonomous AI mathematical agency as the salient feature and drawing dozens of retweets from AI safety and capability communities [19][20][21]. The Reddit r/MachineLearning community hosted substantive discussion under the skeptical framing 'OpenAI claims a general-purpose reasoning model found a…' — the word 'claims' reflecting a persistent wariness in technical circles [22].

The result arrives within a broader wave of AI progress in mathematics. Quanta Magazine published 'The AI Revolution in Math Has Arrived' in April 2026 [23], and UC Irvine and USC received a $2.6 million DARPA grant for AI-driven mathematics on May 18, 2026 [24]. Applied technical communities beyond academia have also engaged: the Galois security company has written on o3, FrontierMath, and the future of mathematics [25]. Against this backdrop, OpenAI's unit distance result functions both as a singular claimed milestone and as part of a trend — making the question of independent peer review particularly consequential for establishing lasting credibility.

Timeline

1946: Paul Erdős first poses the planar unit distance conjecture in discrete geometry [5]
2026-04: Quanta Magazine publishes 'The AI Revolution in Math Has Arrived,' establishing AI mathematical capabilities as a recognized trend [23]
2026-05-18: UC Irvine and USC announce a $2.6 million DARPA grant for AI-driven mathematics breakthroughs [24]
2026-05-20: OpenAI announces a general-purpose reasoning model has disproved the unit distance conjecture via counterexample [1]
2026-05-21: arXiv preprint 'Remarks on the disproof of the unit distance conjecture' appears; Gil Kalai endorses the result via X and his blog; Alex Dimakis's commentary is widely retweeted; The Guardian and Scientific American publish mainstream coverage [2][3][6][7][28][8][9]
2026-05-21: Po-Shen Loh, Zvi Mowshowitz, and other named commentators weigh in; Reddit r/math and Hacker News host extended discussion [37][36][38][39]
2026-05-22: Michael Harris publishes Silicon Reckoner posts on the Erdős result and 'artificial intuition'; New Scientist joins mainstream coverage; MIRI Berkeley amplifies the result as 'autonomously disproved'; Epoch AI reports GPT-5.4 set a new FrontierMath record; Harris's earlier FrontierMath 'scandal' framing and TechCrunch/TechRepublic benchmark discrepancy reporting surface as credibility context [12][13][10][17][15][14][33][34]
2026-05-23: Continued broad social media amplification via MIRI Berkeley retweets; Reddit r/MachineLearning hosts skeptical 'OpenAI claims' discussion; Terence Tao posts on Mathstodon engaging with the approach [18][19][20][22][11]
2026-05-24: Further amplification across X, Instagram, TikTok, and LinkedIn; OpenAI publishes 'The Erdős Breakthrough' on LinkedIn [26][40][41]

Perspectives

OpenAI

Presents the disproof as a landmark milestone in AI-driven mathematics; the LinkedIn post titles it 'The Erdős Breakthrough'; provides no methodological caveats or detail in public announcements

Evolution: consistent

[1][26]

Terence Tao (mathematician, Fields Medalist)

Has posted on Mathstodon with partial text 'I was able to use an extended…' indicating direct engagement with the approach or a related extension; the full extent of his assessment is not captured in available sources

Evolution: new voice this pass — highest-credential mathematical engagement yet surfaced in the thread

[11]

Gil Kalai (mathematician, combinatorics blogger)

Enthusiastically endorses the result via both X and his WordPress blog, calling it 'amazing' and crediting AI directly; his engagement from inside the relevant mathematical community lends the result significant credibility

Evolution: consistent

[6][7][27]

Michael Harris (mathematician, Silicon Reckoner blogger)

Has published two posts specifically addressing the Erdős result and 'artificial intuition,' building a sustained critical and analytical engagement; previously characterized OpenAI's FrontierMath benchmark involvement as a 'scandal'

Evolution: deepened — moved from general benchmark criticism to direct epistemological engagement with this specific result and what kind of cognition it demonstrates

[12][13][14]

MIRI Berkeley (AI safety research organization)

Amplifies the result under the framing 'An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry,' treating autonomous AI mathematical agency as the salient feature for AI capability and safety discourse

Evolution: new voice this pass

[17][18][19][20]

Epoch AI

Reports GPT-5.4 set a new FrontierMath record, adding benchmark evidence for OpenAI's math capability claims while also reopening questions about evaluator independence

Evolution: new voice this pass

[15]

Alex Dimakis (ML/information theory researcher)

Frames the result as a breakthrough in combinatorics; his original post became the most widely retweeted expert commentary on the result

Evolution: consistent

[28][29][30][31][32]

Scientific American / The Guardian / New Scientist

Characterize the result as AI's biggest or most significant mathematical breakthrough yet; quote mathematicians expressing amazement

Evolution: New Scientist added this pass, deepening mainstream press consensus across three major science outlets

[9][8][10]

Reddit r/MachineLearning

Hosts discussion under the skeptical framing 'OpenAI claims a general-purpose reasoning model found a…' — the word 'claims' signals community-level wariness about OpenAI's self-reported math achievements

Evolution: new voice this pass

[22]

TechCrunch / TechRepublic

Document the FrontierMath benchmark controversy: o3 scored lower on independent assessments than OpenAI initially implied, and OpenAI secretly funded the benchmark before achieving records on it

Evolution: consistent

[33][34]

William Jin (@WilliamJin06)

Measured enthusiasm: calls the result 'monumental' but explicitly distinguishes it from AGI, signaling a preference for calibrated rather than maximalist interpretation

Evolution: consistent

[35]

Zvi Mowshowitz

Cautiously impressed; calls this the first AI math result he finds genuinely impressive, embedding it within broader capability and safety commentary

Evolution: consistent

[36]

Rohan Paul (@rohanpaul_ai)

Bullish; reads the result as evidence that test-time compute on a general-purpose model is sufficient for research-grade mathematical output without specialized architecture

Evolution: consistent

[4]

Tensions

Transparency gap: OpenAI presents the result as a clean milestone framed as 'The Erdős Breakthrough' [1][26], while the broader commentary, from Rohan Paul and Milk Road AI to the arXiv preprint, treats the novel algebraic number theory method as the central puzzle, one that the public announcement leaves poorly explained [4][5][2]. [1][26][4][5][2]
Autonomy framing: MIRI Berkeley and AI-capability boosters emphasize that the model 'autonomously disproved' the conjecture [17][18], while William Jin explicitly cautions that 'monumental' does not equal AGI [35] — a debate about what degree of human scaffolding and problem-framing shaped the result that OpenAI's sparse announcement does not resolve [1]. [17][18][35][1]
Evaluator independence: Michael Harris has explicitly labeled OpenAI's FrontierMath benchmark involvement a 'scandal' [14], and TechCrunch and TechRepublic have independently documented benchmark discrepancies in OpenAI's math claims [33][34]. Epoch AI's report that GPT-5.4 set a new FrontierMath record [15] deepens this tension — it may represent independent validation or another iteration of the same conflict-of-interest pattern, and the distinction is not yet clear from available sources. [14][33][34][15]
What kind of cognition is this? Michael Harris is explicitly exploring 'artificial intuition' as a frame for what the model demonstrated [13], while Rohan Paul and others argue that test-time compute on a general-purpose LLM is sufficient for frontier discovery [4] — two framings that carry very different implications for how the result should be understood, credited, and built upon. [13][4]

Sources

[1] An OpenAI model has disproved a central conjecture in discrete geometry — OpenAI Blog (2026-05-20)
[2] [2605.20695] Remarks on the disproof of the unit distance conjecture — reactive:openai-erdos-math-breakthrough
[3] [PDF] Remarks on the disproof of the unit distance conjecture - arXiv — reactive:openai-erdos-math-breakthrough
[4] A general-purpose LLM can produce frontier research when given enough test-time compute. — Rohan Paul Twitter (2026-05-21)
[5] This is WILD! — Milk Road AI Twitter (2026-05-21)
[6] An internal model of Open AI disproved Erdos unit distance conjecture. — reactive:openai-erdos-math-breakthrough
[7] Unit Distance Problem | Combinatorics and more — reactive:openai-erdos-math-breakthrough
[8] OpenAI makes breakthrough on 80-year-old maths problem — reactive:openai-erdos-math-breakthrough
[9] OpenAI announces AI's biggest math breakthrough yet — reactive:openai-erdos-math-breakthrough
[10] Mathematicians stunned by AI's biggest breakthrough in ... — reactive:ai-formal-math-breakthroughs
[11] Terence Tao: "I was able to use an extended …" - Mathstodon.xyz — reactive:openai-erdos-math-breakthrough
[12] About that Erdős problem - by Michael Harris - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[13] About "artificial intuition" - by Michael Harris - Silicon Reckoner — reactive:openai-erdos-math-breakthrough
[14] The FrontierMath scandal - by Michael Harris — reactive:openai-erdos-math-breakthrough
[15] GPT-5.4 set a new record on FrontierMath - Epoch AI — reactive:openai-erdos-math-breakthrough
[16] Advancing science and math with GPT-5.2 | OpenAI — reactive:openai-erdos-math-breakthrough
[17] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-24)
[18] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[19] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[20] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[21] RT @MIRIBerkeley: An internal model at OpenAI has autonomously disproved a central conjecture in discrete geometry, a ma... — reactive:openai-erdos-math-breakthrough (2026-05-23)
[22] OpenAI claims a general-purpose reasoning model found a ... - Reddit — reactive:openai-erdos-math-breakthrough
[23] The AI Revolution in Math Has Arrived | Quanta Magazine — reactive:openai-erdos-math-breakthrough
[24] UC Irvine, USC receive $2.6 million DARPA grant for AI to drive math breakthroughs – UC Irvine News — reactive:openai-erdos-math-breakthrough
[25] Galois - o3, Frontier Math, and the Future of Mathematics — reactive:openai-erdos-math-breakthrough
[26] The Erdős Breakthrough | OpenAI | 166 comments - LinkedIn — reactive:openai-erdos-math-breakthrough
[27] Gil Kalai on the new AI proof of the Erdős unit distance problem — reactive:openai-erdos-math-breakthrough
[28] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[29] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[30] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[31] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-22)
[32] RT @AlexGDimakis: A breakthrough by OpenAI in a very famous Combinatorics problem, the Planar Unit Distance problem by E... — reactive:openai-erdos-math-breakthrough (2026-05-21)
[33] OpenAI's o3 AI model scores lower on a benchmark ... - TechCrunch — reactive:openai-erdos-math-breakthrough
[34] OpenAI's o3: AI Benchmark Discrepancy Reveals Gaps in ... — reactive:openai-erdos-math-breakthrough
[35] @OpenAI This feels monumental. A general-purpose reasoning model making a frontier-level math contribution isn’t AGI, bu... — reactive:openai-erdos-math-breakthrough (2026-05-21)
[36] AI #169: New Knowledge — Zvi's AI Roundups (2026-05-21)
[37] Po-Shen Loh's Post - LinkedIn — reactive:openai-erdos-math-breakthrough
[38] OpenAI's internal model disproves Unit Distance Conjecture of Erdos — reactive:openai-erdos-math-breakthrough
[39] An OpenAI model has disproved a central conjecture in discrete ... — reactive:openai-erdos-math-breakthrough
[40] 🚨 BREAKING: OpenAI just made history. One of their internal reasoning models autonomously disproved the 80-year-old Erdő... — reactive:openai-erdos-math-breakthrough (2026-05-24)
[41] great article putting tgthr — reactive:openai-erdos-math-breakthrough (2026-05-24)