NVIDIA Cancels 4-Die Rubin Ultra and Faces Structural Market Share Erosion from Trainium, TPUs, and AMD
What
NVIDIA cancelled the original 4-die Rubin Ultra GPU just three months after announcing it at GTC 2026, citing manufacturing execution concerns; the replacement product carries the same name but delivers roughly half the performance of the original design [1]. Separately, SemiAnalysis reports that Anthropic already runs a substantial share of Claude Code inference on AWS Trainium and trains Claude on Google TPUs rather than NVIDIA hardware [4]. This is consistent with a major Anthropic-Amazon deal announced in April 2026, in which Anthropic committed over $100 billion to AWS technologies including Trainium generations 2 through 4 [5]. SemiAnalysis frames both developments as concrete evidence that NVIDIA's CUDA moat is eroding in practice, an erosion now compounded by a roadmap setback [6].
Why it matters
The Rubin Ultra cancellation is the first concrete sign that NVIDIA's next-generation roadmap has slipped, and it arrives at a moment when a leading frontier AI lab has already moved substantial production workloads to non-NVIDIA silicon. If manufacturing difficulties continue, the window for alternatives to expand their foothold widens before NVIDIA can reassert hardware leadership.
Open questions
How significant is the performance gap in practice? Half the announced performance of Rubin Ultra could still be competitive depending on pricing, availability, and the workloads buyers actually need [1].
Will Anthropic's Trainium and TPU adoption encourage other frontier labs to diversify away from NVIDIA, or is this specific to Anthropic's deep AWS relationship and cost structure [4][5]?
What are the downstream effects on HBM memory suppliers from NVIDIA cancelling planned future rack configurations [2]?
Does Trainium achieve genuine performance parity with NVIDIA hardware for Claude Code inference, or is Anthropic accepting a hardware trade-off for supply or cost reasons? At least one analyst describes Trainium as a strategic failure [4][7].
Narrative
Three months after NVIDIA announced the 4-die Rubin Ultra at GTC 2026, the company cancelled the original design due to manufacturing execution concerns [1]. A replacement product retains the 'Rubin Ultra' name but is approximately half the size and delivers roughly half the real-world performance of what was announced [1]. SemiAnalysis, which reported the cancellation on June 29, 2026, also noted that NVIDIA has cancelled some planned future rack configurations, with downstream implications for HBM memory demand quantified in a separate research report [2]. Tom's Hardware separately reported that Rubin CPX accelerators were also removed from NVIDIA's roadmap [3].
Concurrent with these hardware setbacks, SemiAnalysis reports that Anthropic, the operator of Claude Code (which SemiAnalysis describes as the most successful AI agent), runs a substantial share of Claude Code inference on AWS Trainium chips and performs Claude model training on Google TPUs, not NVIDIA hardware [4]. SemiAnalysis characterized this level of Trainium and TPU adoption as something that 'would have been unimaginable' just a year ago [4]. The infrastructure choices align with a large partnership Anthropic announced with Amazon in April 2026: Anthropic committed more than $100 billion over ten years to AWS technologies spanning Graviton and Trainium generations 2 through 4, with nearly 1 GW of Trainium2 and Trainium3 capacity expected online by the end of 2026 [5]. Amazon simultaneously deepened its financial stake, investing an additional $5 billion immediately, with up to $20 billion more committed, on top of a prior $8 billion [5]. Anthropic's annualized run-rate revenue exceeded $30 billion at the time of the announcement, up from approximately $9 billion at the end of 2025 [5].
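The deal figures above support a few back-of-envelope derived numbers. The sketch below is illustrative only: it uses the values reported in [5], treats the bounded figures ("more than", "up to", "approximately") as point values, and assumes the commitment is spread evenly over the ten years, which the announcement does not state. The function and key names are hypothetical.

```python
# Figures as reported in the Anthropic-Amazon announcement [5], in billions of USD.
# Several are bounds or approximations, so every derived value inherits that caveat.
COMMITMENT_MIN_B = 100        # "more than $100 billion"
COMMITMENT_YEARS = 10         # "over ten years"
PRIOR_INVESTMENT_B = 8        # prior Amazon investment
NEW_INVESTMENT_B = 5          # additional immediate investment
FURTHER_MAX_B = 20            # "up to $20 billion more"
RUN_RATE_END_2025_B = 9       # "approximately $9 billion"
RUN_RATE_APR_2026_MIN_B = 30  # "exceeded $30 billion"


def derived_figures() -> dict[str, float]:
    """Return rough derived figures; names are illustrative, not from the source."""
    return {
        # Assumes even spending across the term (an assumption, not a reported fact).
        "min_avg_annual_commitment_b": COMMITMENT_MIN_B / COMMITMENT_YEARS,
        # Cumulative Amazon investment once the immediate $5B is included.
        "cumulative_investment_now_b": PRIOR_INVESTMENT_B + NEW_INVESTMENT_B,
        # Upper bound if the full additional commitment is invested.
        "cumulative_investment_max_b": PRIOR_INVESTMENT_B + NEW_INVESTMENT_B + FURTHER_MAX_B,
        # Lower bound on the run-rate multiple since end of 2025.
        "min_run_rate_multiple": RUN_RATE_APR_2026_MIN_B / RUN_RATE_END_2025_B,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the commitment averages at least $10 billion per year, Amazon's cumulative investment reaches up to $33 billion, and run-rate revenue grew by a factor of at least roughly 3.3 in under four months.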
SemiAnalysis ties these threads together with an explicit competitive argument: NVIDIA's market share is being eroded by AWS Trainium, Google TPUs, and AMD chips, and manufacturing execution failures will only accelerate those losses [6]. The counterpoint comes from a Substack analysis titled 'Amazon Trainium Is A Disaster; Strategy Reset Needed,' which argues Trainium has not delivered on its promise [7] — a direct challenge to the bullish framing from SemiAnalysis and from Anthropic's own public statements about its AWS commitment. Neither view has been independently validated with performance benchmarks in this set of sources.
The full picture remains incomplete. The Anthropic-Amazon announcement does not specify what fraction of Claude workloads run on Trainium versus other hardware. Anthropic's simultaneous multi-cloud positioning (Claude is available on AWS Bedrock, Google Cloud Vertex AI, and Microsoft Azure Foundry [5]) suggests that some inference may run on NVIDIA-backed infrastructure on other clouds, though the sources do not confirm this. The scope of customer impact from the Rubin Ultra cancellation and any knock-on effects on data center build-out plans are similarly not yet public beyond early media and analyst coverage.
Timeline
- 2026-03-01: NVIDIA announces the 4-die Rubin Ultra GPU at GTC 2026 (approximate date, ~3 months before the June 29 cancellation report). [1]
- 2026-04-20: Anthropic and Amazon announce a deal securing up to 5 GW of compute capacity; Anthropic commits $100B+ over 10 years to AWS technologies including Trainium2 through Trainium4. [5]
- 2026-04-20: Amazon announces an additional $5 billion investment in Anthropic, with up to $20 billion more committed, on top of a prior $8 billion stake. [5]
- 2026-04-20: Anthropic reports annualized run-rate revenue exceeding $30 billion, up from approximately $9 billion at end of 2025. [5]
- 2026-06-29: SemiAnalysis reports NVIDIA cancelled the original 4-die Rubin Ultra due to manufacturing execution concerns; replacement product is roughly half the size and performance. [1]
- 2026-06-29: SemiAnalysis reports Claude Code inference runs on AWS Trainium and Claude training on Google TPUs, framing this as evidence of NVIDIA's CUDA moat eroding ahead of expectations. [4]
- 2026-06-29: SemiAnalysis reports NVIDIA also cancelled some planned future rack configurations, with material implications for HBM memory demand. [2]
- 2026-06-29: SemiAnalysis argues NVIDIA's market share is being eroded by Trainium, TPUs, and AMD, and that manufacturing failures will accelerate further losses. [6]
Perspectives
SemiAnalysis
NVIDIA's cancellation of the 4-die Rubin Ultra is a manufacturing execution failure that compounds ongoing market share erosion from Trainium, TPUs, and AMD; Claude Code running on Trainium and TPUs is concrete evidence the CUDA moat is eroding faster than expected.
Evolution: Consistent analytical position critical of NVIDIA execution; newly supported by Rubin Ultra cancellation as a specific data point.
Anthropic
The Amazon deal is framed as an infrastructure response to demand that outpaced capacity, with Trainium central to future Claude workloads; Anthropic emphasizes multi-cloud availability as a competitive differentiator.
Evolution: No prior stance in this thread; announcement is the first public articulation.
Amazon (AWS)
Deepening financial and infrastructure commitment to Anthropic, positioning Trainium as a credible alternative to NVIDIA for frontier AI workloads.
Evolution: Consistent with prior investments; this deal substantially expands financial exposure.
Enertuition (Substack analyst)
Trainium is a strategic failure that requires a reset, directly contesting the bullish narrative around Amazon's custom silicon.
Evolution: No prior stance in this thread; introduced as a dissenting voice.
Tensions
- SemiAnalysis and Anthropic argue Trainium has achieved real production-scale adoption at a frontier lab; Enertuition argues Trainium is a strategic disaster requiring a reset. [4][5][7]
- SemiAnalysis frames the Rubin Ultra cancellation as a significant manufacturing execution failure that will accelerate NVIDIA's competitive losses; NVIDIA has not publicly characterized the change or its scope. [1][6]
- SemiAnalysis argues NVIDIA's CUDA moat is structurally eroding; the extent to which Trainium and TPU adoption is specific to Anthropic's cost and supply situation versus a broader industry shift is unresolved. [4][6]
Status: active and growing
Sources
- [1] INTERESTING: Only 3 months after Rubin Ultra was announced at GTC 2026, the original 4-die Rubin Ultra has been cancelle… — SemiAnalysis Twitter (2026-06-29)
- [2] Furthermore, check out our latest accelerator model update, which talks more about the HBM memory implications of these … — SemiAnalysis Twitter (2026-06-29)
- [3] Nvidia removes Rubin CPX accelerators from its roadmap — Groq 3 LPUs take center stage as CPX is removed — Tom's Hardware
- [4] A good chunk of inference for the most successful AI agent, Claude Code, is done on Trainium, while Claude training is d… — SemiAnalysis Twitter (2026-06-29)
- [5] Anthropic and Amazon expand collaboration for up to 5 gigawatts of new compute — Anthropic News (2026-04-20)
- [6] This all comes against the backdrop of NVIDIA’s market share being eroded by Trainium, TPUs, and AMD chips. For NVIDIA t… — SemiAnalysis Twitter (2026-06-29)
- [7] Amazon Trainium Is A Disaster; Strategy Reset Needed — Enertuition (Substack)