A good chunk of inference for the most successful AI agent, Claude Code, is done on Trainium, while Claude training is d…

SemiAnalysis Twitter · SemiAnalysis (@SemiAnalysis_) · 2026-06-29

SemiAnalysis reports that a significant share of inference for Anthropic's Claude Code agent runs on AWS Trainium, while Claude model training is performed on Google TPUs, and presents this as evidence that NVIDIA's CUDA moat is slowly eroding.

Appears in

NVIDIA Cancels 4-Die Rubin Ultra and Faces Structural Market Share Erosion from Trainium, TPUs, and AMD

Extraction

Topics: ai-inference, ai-hardware, cuda-moat, trainium, claude-code

Claims

A substantial portion of Claude Code inference workloads runs on AWS Trainium chips.
Claude model training is performed on Google TPUs rather than NVIDIA hardware.
A year ago, it would have been considered unimaginable that Trainium and TPUs could grow this rapidly in production AI workloads.
NVIDIA's CUDA moat is slowly eroding as alternative accelerators gain real-world adoption at scale.

Key quotes

A good chunk of inference for the most successful AI agent, Claude Code, is done on Trainium, while Claude training is done on TPUs.

Just a year ago, it would have been unimaginable that TPUs and Trainium could grow this rapidly, while the CUDA moat slowly eroded.