AWS CEO: AI Compute Demand So Strong No A100 Server Has Ever Been Retired

closed · v24 · 2026-05-27 · 579 items

What's new in v24

The new items in this pass primarily deepen two existing narratives without introducing new fault lines. Multiple industry analyses (SemiWiki, FusionWW, and a TokenRing/FinancialContent piece) report that TSMC is working to double its CoWoS packaging capacity[22][23][24]. That adds a potential relief path for the previously named CoWoS-L bottleneck on Rubin Ultra[21] but leaves open whether the expansion will land in time for Rubin Ultra's schedule; this is captured as a new tension replacing the bubble-debate entry. DigiTimes and a Korea Herald post independently corroborate Nvidia's request for 16-high HBM by Q4 2026[16][17], elevating that escalation from a single prior source to multi-outlet consensus.

What

AWS CEO Matt Garman's April 2026 statement that AWS has never retired an A100 server and is completely sold out is the anchor data point for a structural AI infrastructure shortage spanning GPUs, CPUs, and memory simultaneously[1][2]. All three major HBM suppliers are in active HBM4 production[14], but Samsung's entire 2026 HBM allocation is already sold out[13], and Nvidia has escalated demand by requesting 16-high HBM stacks from suppliers by Q4 2026[16][18], signaling another specification cycle before HBM4 has fully ramped. Rubin Ultra faces a CoWoS-L advanced packaging bottleneck at TSMC[21], while multiple industry analyses report TSMC is working to double its overall CoWoS capacity[22][23], a planned expansion whose timing relative to Rubin Ultra's schedule remains unresolved.

Why it matters

Every successive layer of the shortage (GPU, CPU, and memory) has seen demand outrun each fix: A100s were never retired, HBM4 specifications escalated before HBM3e constraints were resolved, and TSMC's CoWoS-L packaging capacity was named as the binding limit on next-generation GPU delivery. Whether TSMC's CoWoS capacity doubling[22] relieves the Rubin Ultra constraint in time is now the most consequential open question for 2026–2027 AI hardware supply.

Open questions

TSMC is reportedly working to double overall CoWoS capacity[22][23][24], but does the expansion specifically relieve the CoWoS-L bottleneck constraining Rubin Ultra[21] on a timeline that matters for that platform's competitive window?
Nvidia's 16-high HBM request targets Q4 2026[16][17] — does this trigger a redesign cycle at all three suppliers analogous to the Rubin spec changes that drove 2026's HBM4 delays, or can suppliers adapt within current production architectures?
With all three suppliers in active HBM4 production[14][36], does the contested Samsung-Micron #2 HBM ranking[27][28] now turn primarily on Nvidia qualification milestones and allocation decisions rather than production readiness?
If Rubin Ultra's CoWoS-L delay[21] extends into 2027, does base Rubin[20] already lock in enough of Nvidia's next-gen installed base to limit AMD MI500's 2H 2027 competitive window[25], or does the gap create a real switching opportunity?

Narrative

On April 26, 2026, AWS CEO Matt Garman stated publicly that AWS has never retired a single Nvidia A100 server and is completely sold out, characterizing AI compute demand as 'almost insatiable'[1][2]. That claim sits atop a confirmed multi-layer shortage: Andy Jassy's April 2026 shareholder letter reported AWS's chips business at a $20B+ annual revenue run rate, with Trainium2 largely sold out, Trainium3 nearly fully subscribed, and Trainium4 capacity reserved 18 months ahead[3][4]. GPU rental prices have moved sharply across generations — SemiAnalysis documented a nearly 40% H100 one-year rental price surge[5] and Nvidia Blackwell GPU rental reached $4.08/hr, a 48% surge[6]. Intel and AMD both executed approximately 15% server CPU price increases — Intel's third round[7][8] — while AMD CEO Lisa Su reported server CPU demand 'far exceeded expectations, supply now tightening'[9].

Memory is the most dynamic part of the shortage. TrendForce framed 2026 as 'HBM4 Delays and HBM3e Dominance'[10], and both Samsung and SK Hynix issued public warnings that AI-driven shortages could persist until 2027 and beyond[11][12]. Samsung's entire 2026 HBM allocation sold out after the company began Nvidia shipments in Q3 2025[13]. At Computex 2026, Micron announced high-volume production of HBM4 designed specifically for Nvidia Vera Rubin[14], placing all three major suppliers in active HBM4 production and resolving the previously flagged risk that Micron's delay would cede competitive ground[15]. Before HBM4 has fully ramped, Nvidia has already escalated: DigiTimes and Korea Herald report that Nvidia is requesting 16-high HBM stacks, beyond the current 12-high standard, from suppliers by Q4 2026[16][17][18], a specification jump that could trigger new redesign cycles at all three suppliers.

A Wedbush January 2026 analysis formalized the causal chain from Nvidia's Rubin architecture spec changes to HBM4 redesign delays at Samsung and SK Hynix[19]. The Rubin story has since bifurcated: base Rubin entered full production at CES 2026[20], while Rubin Ultra faces a more specific constraint — CoWoS-L advanced packaging issues at TSMC[21]. CoWoS-L is the large-format packaging technology required for Rubin Ultra's expanded chiplet configuration; TSMC's CoWoS-L capacity appears to be the binding bottleneck for Rubin Ultra's schedule. Multiple industry analyses now report that TSMC is working to double its overall CoWoS capacity to address AI GPU demand[22][23][24], though whether this expansion resolves the CoWoS-L constraint specifically — and whether the timing relieves the Rubin Ultra bottleneck or arrives after its competitive window — remains open. AMD confirmed Instinct MI500 on 2nm CDNA 6 for 2H 2027[25][26], with competitive relevance directly tied to how long Rubin Ultra's delays extend.

The #2 HBM position remains contested between Samsung and Micron: one analyst placed Micron ahead[27], a subsequent report indicates Samsung reclaimed second place[28], and Micron's Computex high-volume HBM4 announcement[14] may shift the picture again. SK Hynix holds approximately 62% of the HBM market[27]. Intel's stock reached an all-time high of $95.73 and carries the highest forward P/E of any large-cap chip stock[29][30], but AMD's record CPU market share gains, attributed by multiple researchers primarily to Intel supply constraints[31][32][33], represent a competitive headwind that Intel has not publicly acknowledged. Micron's Q2 2026 record revenues[34] and 120% year-to-date stock gain[35] reflect the broad financial upside for memory suppliers positioned across both HBM3e and HBM4 ramps.

Timeline

2025-10-30: Samsung's entire 2026 HBM supply allocation sells out after the company begins Nvidia HBM shipments in Q3 2025, the earliest documented HBM demand saturation anchor in the thread. [13]
2026-01-01: TrendForce publishes 2026 HBM Outlook framed as 'HBM4 Delays and HBM3e Dominance'; multiple industry analyses report TSMC working to double CoWoS packaging capacity through 2026 to relieve AI GPU demand. [10][22][23][24]
2026-01-05: AWS raises EC2 Capacity Block prices 15% in a uniform ML pricing adjustment. [44][45]
2026-01-07: AMD confirms Instinct MI500 for 2H 2027 on 2nm CDNA 6 and MI400 for 2026 at CES; The Register unpacks AMD's full datacenter GPU roadmap. [25][26][46]
2026-01-15: Wedbush publishes analysis formalizing the Nvidia Rubin architecture-to-HBM4-delay causal chain at Samsung and SK Hynix. [19]
2026-02-12: Samsung claims that commercial HBM4 shipments to Nvidia have begun and that it is near a deal to supply over 30% of Nvidia's HBM4 needs. [36][47]
2026-04-09: Jassy's 2025 shareholder letter: AWS chips at $20B+ annual run rate; Trainium2 largely sold out, Trainium3 nearly fully subscribed, Trainium4 reserved 18 months ahead; Amazon considering external Trainium sales at potentially $50B. [3][4][39]
2026-04-24: Intel Q1 FY2026 earnings confirm agentic AI demand driving CPU revenue; Q2 guidance raised; Intel exec warns CPU shortage hitting 'everyone.' [40][41][48]
2026-04-25: Intel stock surges to an all-time high of $95.73 and carries the highest forward P/E of any large-cap chip stock. [29][30]
2026-04-26: AWS CEO Matt Garman publicly states AWS has never retired a single Nvidia A100 server and is completely sold out, citing demand as 'almost insatiable.' [1][2]
2026-04-27: SemiAnalysis H100 one-year rental price index documents nearly 40% surge over six months; Azure confirms older VM retirements while A100-based instances remain active. [5][49]
2026-04-30: Amazon tripled its CPU server count and still ran out of capacity; Meta signs multibillion-dollar deal to deploy tens of millions of AWS Graviton5 cores for agentic AI. [38][50]
2026-05-02: AMD Q1 2026 earnings beat; Lisa Su states server CPU demand 'far exceeded expectations, supply now tightening'; Intel and AMD confirm approximately 15% server CPU price increases — Intel's third round. [9][7][8]
2026-05-03: Samsung and SK Hynix issue public warnings that AI-driven memory shortages could persist until 2027 and beyond; SupplyFrame Intelligence extends the outlook to 2030. [11][12][51]
2026-05-20: Computex 2026: Micron announces high-volume HBM4 production for Nvidia Vera Rubin; Nvidia requests 16-high HBM stacks from suppliers by Q4 2026; Blackwell GPU rental reaches $4.08/hr (+48%); SK Hynix confirmed at 62% HBM market share. [14][18][16][6][27]
2026-05-25: Rubin Ultra identified as specifically facing CoWoS-L advanced packaging bottleneck at TSMC; Samsung reportedly reclaims #2 HBM share from Micron; Micron Q2 2026 record revenues, stock up 120% YTD. [21][28][34][35]

Perspectives

Matt Garman, CEO of AWS

AI compute demand structurally exceeds supply across all GPU generations; AWS completely sold out of A100 capacity, never retired one; demand 'almost insatiable'; personally endorsed Meta's Graviton5 deployment.

Evolution: Consistent. Corroborated by Samsung/SK Hynix shortage warnings, confirmed CPU price increases, Samsung's sold-out 2026 HBM allocation, AMD record CPU share gains, and Blackwell GPU rental above $4/hr.

[1][2][37][38]

Andy Jassy, CEO of Amazon

AWS chips at $20B+ annual revenue run rate; Trainium2 largely sold out, Trainium3 nearly fully subscribed, Trainium4 reserved 18 months ahead; considering external Trainium sales at potentially $50B.

Evolution: Consistent. External Trainium sales plan confirmed by financial press. A discrepancy between MSN's '$15B AI revenue' figure and Jassy's '$20B+ run rate' remains unreconciled.

[3][4][39]

Nvidia

Blackwell GPU rental at $4.08/hr (+48%); base Rubin in production; Rubin Ultra faces CoWoS-L advanced packaging issues at TSMC specifically; requesting 16-high HBM stacks from memory suppliers by Q4 2026.

Evolution: TSMC's reported CoWoS capacity doubling[22][23] adds a potential relief trajectory to the Rubin Ultra bottleneck, but the timing alignment with Rubin Ultra's competitive schedule remains open and unaddressed by Nvidia directly.

[6][20][21][18][16][17]

AMD

Q1 2026 earnings beat; server CPU demand 'far exceeded expectations, supply now tightening'; record CPU market share gains attributed primarily to Intel supply constraints; MI500 confirmed for 2H 2027 on 2nm CDNA 6, MI400 for 2026.

Evolution: Consistent. MI500's 2H 2027 window becomes more or less competitive depending on Rubin Ultra's CoWoS-L delay duration — a variable not addressed by AMD directly.

[9][31][32][33][25][26]

Intel

Q1 FY2026 earnings confirm agentic AI demand driving CPU revenue; stock at an all-time high of $95.73, with the highest forward P/E of any large-cap chip stock; approximately 15% server CPU price increase, Intel's third round; production shifted to Xeon.

Evolution: Consistent. AMD's record CPU market share gains attributed to Intel supply constraints remain an unacknowledged competitive headwind in Intel's public communications.

[40][41][29][30][7][8]

Samsung and SK Hynix

Both warned publicly that AI memory shortages could persist until 2027 and beyond; Samsung's 2026 HBM sold out, HBM4 shipments since February 2026, near a deal for 30%+ of Nvidia's HBM4; SK Hynix holds 62% of the HBM market.

Evolution: Consistent on shortage warnings and sell-out status. The contested Samsung-Micron #2 ranking has a new complicating factor in Micron's Computex high-volume HBM4 production announcement, even as Samsung reportedly reclaimed second place.

[11][12][13][36][27][28]

Micron

High-volume production of HBM4 for Nvidia Vera Rubin announced at Computex 2026; Q2 2026 record revenues; stock up 120% year-to-date; previously documented HBM4 delay risk resolved.

Evolution: Significant competitive repositioning: the Computex high-volume HBM4 announcement moves Micron from potential laggard to active three-way HBM4 competitor, reopening the #2 market share ranking question.

[14][34][35][15][28]

Market research and trade press (TrendForce, SemiAnalysis, Wedbush, WCCFtech, DigiTimes, Korea Herald, SemiWiki, FusionWW)

CPU shortage labeled a 'supercycle'; TrendForce: HBM4 Delays and HBM3e Dominance; Wedbush: Rubin causes HBM4 redesigns; CoWoS-L named as Rubin Ultra's specific bottleneck; TSMC reportedly doubling CoWoS capacity; Nvidia 16-high HBM request corroborated by DigiTimes and Korea Herald.

Evolution: The TSMC CoWoS capacity doubling reporting[22][23] adds a potential relief narrative to the packaging bottleneck story, while DigiTimes and Korea Herald corroboration[16][17] elevates the 16-high HBM escalation from a single-source claim to multi-source consensus.

[10][5][19][21][22][23][24][16][17]

Tensions

Samsung vs. Micron for the #2 HBM market share position: contested across three signals. One analyst placed Micron ahead[27], a report indicates Samsung reclaimed second[28], and Micron's Computex high-volume HBM4 production[14] may shift the picture again. [27][28][14]
CoWoS capacity expansion vs. Rubin Ultra bottleneck timing: TSMC is reportedly doubling overall CoWoS capacity[22][23], but whether that expansion resolves the CoWoS-L constraint for Rubin Ultra specifically — and arrives in time to matter for that platform — remains open[21]. [22][23][24][21]
Base Rubin vs. Rubin Ultra: base Rubin in full production since CES 2026[20], while Rubin Ultra faces a specific CoWoS-L bottleneck at TSMC[21] — a two-track timeline within the same architecture family with distinct competitive implications for AMD and memory suppliers. [20][21][19]
AMD MI500 timing: TechRadar asks whether MI500 arrives 'too late' given Nvidia's Vera Rubin 2026 target[42], while Rubin Ultra's CoWoS-L delay[21] could make AMD's 2H 2027 window more competitive — with base Rubin's head start as the countervailing factor[20]. [42][21][25][20]
Is AMD's record CPU market share gain supply-driven (temporary) or architectural preference (durable)? Multiple researchers attribute AMD's gains primarily to Intel supply constraints[31][32][33], but AMD's own supply tightening[9] complicates indefinite share-taking. [31][32][33][9]
Intel's highest forward P/E of any large-cap chip stock[30] versus AMD's record market share gains attributed to Intel supply failures[43]: Intel's premium valuation prices in a recovery that AMD market share data suggests may not fully materialize. [30][29][43][31]

Status: active and growing

Sources

[1] AWS CEO Matt Garman: "Because there is so much more demand than supply, there typically still is demand for the older ch… — Rohan Paul Twitter (2026-04-26)
[2] Matt Garman, CEO of AWS, Amazon's $100+ billion cloud division and what he just said is the single most important data p… — Milk Road AI Twitter (2026-04-26)
[3] Amazon CEO reveals AI revenue, dismisses spending doubts in ... — reactive:aws-garman-a100-demand
[4] Amazon's chip business could be worth $50 billion, Jassy says ... - TNW — reactive:aws-garman-a100-demand
[5] Launching our H100 1 Year Rental Price Index - SemiAnalysis — reactive:aws-garman-a100-demand
[6] Nvidia Blackwell GPU Rental Hits $4.08/hr: 48% Surge [2026] — reactive:aws-garman-a100-demand
[7] Intel Exec Confirms CPU Price Increases For OEMs Amid Supply ... — reactive:aws-garman-a100-demand
[8] Intel is raising CPU prices again in May 2026. This will be the third ... — reactive:aws-garman-a100-demand
[9] AMD CEO Lisa Su Says Server CPU Demand "Far Exceeded ... — reactive:aws-garman-a100-demand
[10] 2026 HBM Outlook: HBM4 Delays and HBM3e Dominance | TrendForce — reactive:aws-garman-a100-demand
[11] Memory shortage set to run until 2027 as chipmakers focus on AI - Nikkei Asia — reactive:aws-garman-a100-demand
[12] Samsung and SK hynix warn AI-driven memory shortages could last ... — reactive:aws-garman-a100-demand
[13] Samsung sells out 2026 HBM supply after starting Nvidia shipments in Q3 - KED Global — reactive:aws-garman-a100-demand
[14] Micron in High-Volume Production of HBM4 Designed for NVIDIA ... — reactive:nvidia-vera-computex-launch
[15] Micron's reported HBM4 delay could cede AI chip advantage to Samsung and SK Hynix — reactive:micron-hbm-bull-case
[16] Nvidia reportedly sets 4Q26 target for 16-high HBM supply - digitimes — reactive:aws-garman-a100-demand
[17] NVIDIA drives demand for 16-layer HBM, pressuring Samsung, SK ... — reactive:aws-garman-a100-demand
[18] Nvidia Requests 16-High HBM Chips from Suppliers by Q4 2026 | Michael Nitkowski posted on the topic | LinkedIn — reactive:aws-garman-a100-demand
[19] NVIDIA Rubin Architecture Triggers HBM4 Redesigns and Technical ... — reactive:aws-garman-a100-demand
[20] NVIDIA Rubin Enters Full Production | Introl Blog — reactive:aws-garman-a100-demand
[21] NVIDIA's "Rubin Ultra" Reportedly Faces Issues With CoWoS-L ... — reactive:aws-garman-a100-demand
[22] How TSMC is Doubling CoWoS Capacity to Break the AI Supply ... — reactive:aws-garman-a100-demand
[23] CoWoS Capacity Set to Skyrocket by 2026: Massive Growth in Advanced Packaging | SemiWiki — reactive:ai-demand-bubble-debate
[24] Inside the AI Bottleneck: CoWoS, HBM, and 2–3nm ... — reactive:ai-demand-bubble-debate
[25] AMD Confirms MI500 AI Accelerator for 2027: 2nm Node, CDNA 6 ... — reactive:aws-garman-a100-demand
[26] AMD's 2026-2027 AI Roadmap: Instinct MI400 & MI500 ... - Wccftech — reactive:aws-garman-a100-demand
[27] SK hynix holds 62% of HBM, Micron overtakes Samsung, 2026 ... — reactive:micron-hbm-bull-case
[28] Samsung Electronics Overtakes Micron to Reclaim Second Place in ... — reactive:micron-hbm-bull-case
[29] Intel Shifts Production to Xeon Amid AI Workload Shortages - LinkedIn — reactive:aws-garman-a100-demand
[30] Intel surges on a blowout Q1, ranks the highest forward P/E of any ... — reactive:aws-garman-a100-demand
[31] AMD gains CPU share as Intel fights supply squeeze — reactive:aws-garman-a100-demand
[32] Intel's dominance is slipping as Ryzen and EPYC processors gain ... — reactive:aws-garman-a100-demand
[33] Intel’s Supply Issues Helped AMD Grab Record-High CPU Market Share: Researcher — reactive:aws-garman-a100-demand
[34] Micron Q2 2026: Record Revenue on AI Boom [Analysis] — reactive:aws-garman-a100-demand
[35] Micron Stock Up 120% YTD: What the HBM Memory Leader Plans ... — reactive:aws-garman-a100-demand
[36] Samsung Claims Lead in Race to Ship AI Chips to Nvidia - Bloomberg — reactive:aws-garman-a100-demand
[37] @indiaesh @r0ck3t23 **Fact check: Accurate on the core claim.** — reactive:aws-garman-a100-demand (2026-04-30)
[38] Amazon Tripled Its CPU Servers and Still Ran Out as Agentic AI Gobbles Up Every Available Processor in the Cloud — reactive:aws-garman-a100-demand
[39] Amazon to Sell Trainium AI Chips to Third Parties - LinkedIn — reactive:aws-garman-a100-demand
[40] Intel Q1 FY 2026 Earnings: Agentic CPU Demand, Foundry Upside - Futurum — reactive:aws-garman-a100-demand
[41] Intel soars on signs AI boom for CPUs is here - Reuters — reactive:aws-garman-a100-demand
[42] AMD says its Instinct MI500 AI Accelerator will come in 2027 — reactive:aws-garman-a100-demand
[43] Intel Just Gifted AMD Its Strongest Buy Signal Yet — reactive:aws-garman-a100-demand
[44] AWS raises GPU prices 15% on a Saturday • The Register — reactive:aws-garman-a100-demand
[45] AWS Hikes EC2 Capacity Block Rates by 15% in Uniform ML Pricing Adjustment - InfoQ — reactive:aws-garman-a100-demand
[46] Unpacking AMD's latest datacenter CPU and GPU announcements — reactive:aws-garman-a100-demand
[47] Micron Stock Rises. Why Samsung's Claim to Be 'First' on ... - Barron's — reactive:aws-garman-a100-demand
[48] Intel prioritizes Xeon chips for data centers as consumer processor ... — reactive:aws-garman-a100-demand
[49] NVv4 series retirement - Azure Virtual Machines | Azure Docs — reactive:aws-garman-a100-demand
[50] Meta Deploys Tens of Millions of AWS Graviton5 Cores — reactive:aws-garman-a100-demand
[51] Industry Experts Warn Current DRAM Shortage Could Last Until 2030 — reactive:aws-garman-a100-demand