AWS CEO: AI Compute Demand So Strong No A100 Server Has Ever Been Retired
Version 4
2026-04-28 04:09 UTC · 111 items
Narrative
AWS CEO Matt Garman's April 26, 2026 claim that AWS has never retired a single Nvidia A100 server continues to anchor this thread, but two new data points have substantially expanded the analytical frame: SemiAnalysis's formalized H100 rental price index, which documents a nearly 40% surge in Nvidia H100 rental prices over six months[1][2], and a Reuters exclusive reporting that Amazon CEO Andy Jassy projects AI will double prior AWS sales projections to $600 billion by 2036[3][4]. The H100 price data has propagated from SemiAnalysis's newsletter into mainstream media, a YouTube video[5], Reddit's r/NVDA_Stock[6], a YouTube podcast on the GPU shortage crisis[7], and, most significant institutionally, Polymarket prediction contracts on where H100 rental prices will land by April 30, 2026[8][9]. That prediction markets now treat GPU rental prices as a tradeable variable marks a qualitative shift: GPU compute is being priced with the same kind of infrastructure used for commodity and financial markets. Milk Road AI extended the social media amplification of Garman's A100 claim onto Threads[10], and LinkedIn commentary linked the January 2026 AWS price increase directly to the AI demand signal[11].
The Azure VM retirement documentation has been further consolidated without changing its fundamental character. Microsoft Learn pages formally confirm the NVv4 (AMD Radeon Instinct MI25-based) retirement by September 30, 2026[12][13], and a LinkedIn post[14], Azalio[15], and Dizzion[16] have all published third-party documentation of the same deadline. A separate Microsoft Q&A thread addresses broader VM migration to latest-generation hardware ahead of a November 15, 2028 deadline[17], and Helient published a comprehensive Azure VM 2028 retirement timeline and migration guide[18]. Together these establish that Azure is running parallel retirement tracks at different timescales across different VM families. The previous synthesis's core finding holds: Azure's active GPU retirements cover older AMD and pre-A100 Nvidia series, not the A100-based NDasrA100_v4, which remains documented and active.
Three materially new developments have emerged this cycle. First, a cloud repatriation narrative has crystallized across multiple independent sources: New Server Life published direct coverage of enterprises moving AI and heavy workloads back on-premise in 2026[19]; Intelegain documented 'cloud bill shock' from AI workloads draining IT budgets[20]; Clanker Cloud published an enterprise AI workstation vs. cloud cost analysis showing competitive on-prem economics[21]; and Next Platform argued that GPU hours are no longer the right unit for measuring AI training costs, signaling methodological maturation in enterprise infrastructure evaluation[22]. Together, these are the clearest concrete manifestation to date of the demand-destruction risk: high cloud GPU prices are prompting documented enterprise migration back to on-premise, not just theoretical concern. Second, Business Standard reports that Amazon is actively considering selling its in-house AI chips, Trainium and Inferentia, to external companies amid the demand surge[23]. If realized, this would make AWS a merchant silicon vendor competing with Nvidia, complicating the supply side of the scarcity narrative Garman himself articulated. Third, Blackwell-generation GPU pricing (B200, B300, DGX systems) is now being documented[24], establishing the cost context for the generation that will eventually succeed A100 and H100 in the demand mix and against which A100 price-performance will ultimately be measured.
The overall discourse has evolved from a reactive cluster around a single operational claim by one CEO into a multi-front debate with genuine institutional weight. The November 2025 $38 billion OpenAI-AWS deal[25] provides retrospective context for why A100 demand has not abated: major AI labs are locked into AWS at multi-year scale. Jassy's $600 billion AWS revenue projection by 2036[3][4] sets the scale of the bull case against which telecom-bust analogies and the cloud repatriation signal must be weighed. GPU price tracking is now formalized across SemiAnalysis indexes, Polymarket contracts, Yahoo Finance prediction aggregation, and mainstream media amplification; the market is being watched with the systematic rigor previously reserved for energy or semiconductor commodity markets.
Timeline
- 2025-11-03: Amazon and OpenAI announce a $38 billion AWS cloud deal, locking a major AI lab into AWS infrastructure at multi-year scale and providing retrospective context for sustained A100 demand. [25]
- 2026-01-05: AWS raises EC2 Capacity Block prices 15% in a uniform ML pricing adjustment, widely interpreted as demand-driven; subsequent analysis notes it reflects dynamic pricing mechanics with variable real-world impact depending on reservation structure. [48][73][74][50][51][11]
- 2026-02-01: AWS announces EC2 Capacity Blocks can now be shared across multiple accounts, easing enterprise multi-account ML infrastructure management. [77]
- 2026-03-17: Reuters publishes exclusive: Amazon CEO Andy Jassy projects AI will double prior AWS sales projections to $600 billion by 2036, setting the scale of the bull case for AI infrastructure demand. [3][4]
- 2026-04-09: Business Standard reports Amazon is considering selling in-house AI chips (Trainium, Inferentia) to external companies amid the AI demand surge — a potential pivot that would make AWS a merchant silicon vendor. [23]
- 2026-04-23: Next Platform publishes 'Stop Measuring AI Training Costs In GPU Hours,' signaling a methodological shift in how enterprises evaluate AI compute spending beyond simple GPU-hour metrics. [22]
- 2026-04-26: AWS CEO Matt Garman publicly states AWS has never retired a single Nvidia A100 server and is completely sold out of A100 capacity, citing persistent demand exceeding supply even for older GPU generations. [26][27][36][78][69]
- 2026-04-26: Statement rapidly amplified across X, LinkedIn, Reddit, SemiWiki, and Threads; investment commentary frames it as the definitive AI infrastructure demand signal. [37][79][80][81][82][42][43][10]
- 2026-04-27: Azure VM retirement documentation consolidates: NVv4 (AMD Radeon Instinct MI25) and NVv3 (older Nvidia Tesla) series confirmed for September 30, 2026 retirement across Microsoft Learn, LinkedIn, Azalio, and Dizzion; Helient documents a parallel 2028 retirement track for other VM families. The A100-based NDasrA100_v4 series remains active. [31][32][33][34][35][14][12][13][15][16][17][18]
- 2026-04-27: SemiAnalysis launches H100 one-year rental price index documenting nearly 40% surge in H100 rental prices over six months; data propagates to MSN, YouTube, Reddit r/NVDA_Stock, and Polymarket prediction markets tracking H100 prices by April 30. [64][7][65][5][6][8][66][1][2]
- 2026-04-27: Cloud repatriation narrative crystallizes: multiple independent sources document enterprises moving AI workloads back on-premise due to escalating cloud GPU costs, including New Server Life, Intelegain, and Clanker Cloud cost analyses. [21][19][20][52]
- 2026-04-27: Continued investment and social media amplification; YouTube commentary argues AWS, Microsoft, and Google may be pricing themselves out of AI; YouTube video frames Amazon's broader AI chip and cloud business as a hidden investment opportunity. [44][68][30]
- 2026-04-27: NVIDIA Blackwell-generation GPU pricing (B200, B300, DGX systems) documented, establishing next-generation cost context as A100 and H100 demand evolves. [24]
Perspectives
Matt Garman, CEO of AWS
AI compute demand structurally exceeds supply across all GPU generations, including legacy hardware. AWS is completely sold out of A100 capacity and has never retired one. Demand is 'almost insatiable.'
Evolution: Consistent on demand posture. Now backed by Amazon CEO Andy Jassy's $600B AWS revenue projection by 2036 and the $38B OpenAI deal, which together institutionalize the bull case Garman's operational claim implied.
Andy Jassy, CEO of Amazon
AI will double prior AWS sales projections, reaching $600 billion in AWS revenue by 2036. Separately, Amazon is reportedly exploring the sale of its in-house AI chips (Trainium, Inferentia) to external customers amid the demand surge.
Evolution: New to this synthesis as a distinct voice — Jassy's $600B forecast and the chip externalization report both emerged or were surfaced in this cycle and substantively extend the Amazon narrative beyond Garman's operational claim.
Microsoft Azure
Azure is retiring NVv4 (AMD MI25-based) and NVv3 (older Nvidia Tesla-based) VM series by September 30, 2026. A separate 2028 retirement track covers additional older VM families. The A100-based NDasrA100_v4 series remains documented and active.
Evolution: Further documented — the previous synthesis corrected an overstatement of the Azure A100 retirement claim; this cycle adds more official Microsoft Learn pages and third-party sources confirming the NVv4/NVv3 retirement scope, plus a new 2028 retirement timeline for other series.
Investment and financial commentary (Milk Road AI, The AI Investor, LEAPTRADER, SpecialSitsNews, Barclays, InvestorPlace, Seeking Alpha, Polymarket)
Garman's claim that AWS has never retired an A100 server is a landmark demand signal. GPU pricing is now tracked via formalized instruments: SemiAnalysis's H100 index, Polymarket prediction contracts, and Yahoo Finance prediction aggregation. Polymarket running formal contracts on H100 prices by April 30 represents institutional treatment of GPU pricing as a tradeable variable.
Evolution: Significantly upgraded — prediction markets (Polymarket) are now tracking H100 GPU rental prices as a formal contract event, representing a qualitative escalation from analyst commentary to market-priced expectations.
Enterprise practitioners and cloud architects
GPU capacity constraints are a real operational problem — on-demand instances are unreliable, Capacity Blocks and reservations are required, and pricing has risen sharply. Cloud bill shock from AI workloads is a documented phenomenon in 2026. A growing cohort is executing or evaluating cloud repatriation: moving AI workloads back on-premise as hyperscaler GPU costs escalate.
Evolution: Materially updated — cloud repatriation has moved from a theoretical response to a documented enterprise behavior, with multiple independent sources covering it in April 2026 as a current trend rather than a future possibility.
AI bubble skeptics and value investors (Hacker News, Reddit, Futuriom, Latticework/MOI Global, Substack, LinkedIn 'Treadmill')
The telecom-bust analogy now has three independent articulations. Cloud repatriation data now provides concrete enterprise-level demand-destruction evidence to support the structural critique — enterprises are not merely raising concerns about GPU costs, they are acting on them by moving workloads off cloud. Fortune Magazine amplification signals mainstream financial press engagement.
Evolution: Strengthened by cloud repatriation data — the skeptic thesis has acquired real-world behavioral evidence rather than relying solely on historical analogy. The 'demand-destruction loop' framing introduced in the prior synthesis now has empirical grounding.
GPU pricing analysts (SemiAnalysis, Silicon Data CEO Carmen Li, Cast AI, Spheron, Fusion Worldwide)
Nvidia H100 GPU rental prices have surged nearly 40% in six months per SemiAnalysis's formalized index[1][2]. The H100 price surge is now being tracked in mainstream media, prediction markets, and YouTube. Blackwell-generation pricing is beginning to be documented as the successor cost context.
Evolution: SemiAnalysis data has now achieved mainstream amplification (MSN, YouTube, Reddit NVDA_Stock, Polymarket) and has been formalized into an institutional index. The 40% figure is the dominant GPU pricing benchmark in current coverage.
YouTube and video commentary
'AWS, Microsoft, and Google Are Pricing Themselves Out of AI' introduces demand-destruction risk from hyperscaler price escalation. A separate video frames Amazon's broader AI chip and cloud business as a hidden investment opportunity. YouTube shorts document H100 rental price increases.
Evolution: Expanded — video commentary now covers both the demand-destruction risk angle and a bullish Amazon AI business framing, representing both camps in video format.
Trade press (Data Center Dynamics, The Register, Network World, InfoQ, IT Pro, MSN, Next Platform)
Factual reporting on the GPU capacity shortage as a structural story. MSN amplification of SemiAnalysis's 40% H100 price surge signals the pricing story has reached mainstream tech media. Next Platform's methodological argument about moving beyond GPU-hour cost metrics suggests the analytical frame for AI infrastructure costs is itself evolving.
Evolution: Expanded reach — the GPU pricing angle has moved from trade press to mainstream tech media (MSN) and from operational reporting to methodological critique (Next Platform).
Tensions
- Is the A100 demand signal evidence of durable structural AI enterprise adoption, or does it reflect a speculative overbuild by a small number of hyperscale AI customers that could unwind if enterprise ROI disappoints? The $38B OpenAI-AWS deal and Jassy's $600B projection are the bull case; the cloud repatriation signal and telecom-bust analogies are the bear case. [27][53][54][71][38][39][55][57][3][25]
- Cloud repatriation as demand-destruction in action: multiple sources now document enterprises actively moving AI workloads back on-premise due to escalating cloud GPU costs, providing concrete behavioral evidence for the demand-destruction risk that was previously only theoretical. [21][19][22][20][52][68]
- Amazon's reported consideration of selling Trainium and Inferentia chips externally introduces a supply-side variable not present in the original framing: if AWS becomes a merchant silicon vendor, it could reshape the GPU supply landscape and complicate the scarcity narrative that Garman's A100 claim rests on. [23][26][27]
- Prediction markets (Polymarket) are now running formal contracts on H100 GPU rental prices by April 30, 2026 — treating GPU compute pricing as a tradeable market variable. Whether market-implied GPU price expectations align with or diverge from analyst forecasts and actual rental market data is an unresolved question. [9][8][1][2]
- The Azure-vs-AWS A100 retirement contrast is less clear than initially framed: Azure is retiring the NVv4 (AMD MI25) and NVv3 (older Nvidia Tesla) VM series by September 2026 and other families by 2028, while the A100-based NDasrA100_v4 series remains active. The divergence from AWS's posture is real, but it is narrower and more complex than an A100-specific divergence. [31][32][33][34][14][12][13][15][16][17][18][26]
- GPU price escalation may be self-defeating: the nearly 40% rise in H100 rental prices over six months, combined with documented cloud bill shock and cloud repatriation behavior, suggests the hyperscaler pricing strategy may already be suppressing the enterprise AI adoption that justifies the infrastructure buildout. [1][2][68][21][19][20][63]
- As A100 and H100 demand remains structurally elevated, Blackwell-generation GPU pricing is beginning to be documented — raising the question of whether next-generation pricing will accelerate enterprise repatriation or whether Blackwell supply normalization will relieve the current pricing pressure that is driving it. [24][67][64][19]
- With hyperscaler AI capex heading toward $700 billion and three independent analytical frameworks applying the 1990s telecom bust analogy, the question of whether AI infrastructure could become stranded assets is moving from fringe to mainstream financial discourse, even as Jassy's $600B AWS revenue projection by 2036 asserts that the demand is real and durable. [41][58][59][60][61][62][57][3][4]
- GPU pricing concentration risk: the shift to Capacity Block reservation models, together with a small number of large customers attempting to buy out AWS's entire GPU capacity, suggests smaller enterprises and startups are being crowded out, with cloud repatriation now documented as the behavioral outcome for those priced out of the cloud tier. [48][75][46][47][76][19][20]
Sources
- [1] Nvidia's H100 GPU rental prices surge nearly 40% in 6 months - MSN — reactive:aws-garman-a100-demand
- [2] Launching our H100 1 Year Rental Price Index - SemiAnalysis — reactive:aws-garman-a100-demand
- [3] Exclusive: Amazon CEO sees AI doubling prior AWS sales ... - Reuters — reactive:aws-garman-a100-demand
- [4] Exclusive-Amazon CEO sees AI doubling prior AWS sales ... — reactive:aws-garman-a100-demand
- [5] Nvidia H100 GPU rental price increase in 2026 - YouTube — reactive:aws-garman-a100-demand
- [6] The Great GPU Shortage – Rental Capacity [Semi Analysis] - Reddit — reactive:aws-garman-a100-demand
- [7] Global GPU Shortage Crisis | H100 Rental Prices Surge 40% — reactive:aws-garman-a100-demand
- [8] GPU rental prices (H100) hit___ by April 30? Trading Odds & Predictions 2026 | Polymarket — reactive:aws-garman-a100-demand
- [9] GPU rental prices (H100) hit___ by April 30? - Yahoo Finance — reactive:aws-garman-a100-demand
- [10] And AWS has never once taken an A100 offline because demand is ... — reactive:aws-garman-a100-demand
- [11] AI Compute Costs Rise: AWS GPU Prices Increase 15% | Mutha Nagavamsi posted on the topic | LinkedIn — reactive:aws-garman-a100-demand
- [12] NVv4 series retirement - Azure Virtual Machines | Microsoft Learn — reactive:aws-garman-a100-demand
- [13] Retired Azure VM size series - Virtual Machines - Microsoft Learn — reactive:aws-garman-a100-demand
- [14] Azure to retire NVv4-series VMs on September 30, 2026 - LinkedIn — reactive:aws-garman-a100-demand
- [15] Retirement: NVv4-series Azure Virtual Machines will be ... - Azalio — reactive:aws-garman-a100-demand
- [16] Azure NV Series EOL Announcement - Dizzion Support Center — reactive:aws-garman-a100-demand
- [17] Migrate your retiring Azure Virtual Machines (VMs) to latest-generation VMs before 15 November 2028 - Microsoft Q&A — reactive:aws-garman-a100-demand
- [18] Azure VM Size Series Retirement 2028: Timeline and Migration ... — reactive:aws-garman-a100-demand
- [19] Cloud Repatriation in 2026: Moving AI & Heavy Workloads On-Prem — reactive:aws-garman-a100-demand
- [20] Cloud Bill Shock in 2026: How AI Workloads Are Draining IT Budgets — reactive:aws-garman-a100-demand
- [21] Enterprise AI Workstation vs Cloud Cost Analysis 2026 — What the Numbers Actually Say | Clanker Cloud Blog — reactive:aws-garman-a100-demand
- [22] Stop Measuring AI Training Costs In GPU Hours - The Next Platform — reactive:aws-garman-a100-demand
- [23] Amazon considers selling in-house AI chips to firms amid demand surge | Company News - Business Standard — reactive:aws-garman-a100-demand
- [24] NVIDIA Blackwell GPU Pricing: B200, B300 and DGX Cost Breakdown — reactive:aws-garman-a100-demand
- [25] Amazon closes at record after $38 billion OpenAI deal with AWS — reactive:aws-garman-a100-demand
- [26] AWS CEO Matt Garman: "Because there is so much more demand than supply, there typically still is demand for the older ch… — Rohan Paul Twitter (2026-04-26)
- [27] Matt Garman, CEO of AWS, Amazon's $100+ billion cloud division and what he just said is the single most important data p… — Milk Road AI Twitter (2026-04-26)
- [28] AWS CEO Says Compute Demand 'Almost Insatiable' — reactive:aws-garman-a100-demand
- [29] AWS CEO Garman said space data centers likely to take longer — reactive:aws-garman-a100-demand
- [30] Amazon's Hidden AI Business Could Change Everything for Investors — reactive:aws-garman-a100-demand
- [31] NVv4 series retirement - Azure Virtual Machines | Azure Docs — reactive:aws-garman-a100-demand
- [32] Is the NVIDIA A100 VM Series Being Retired in Azure? - Microsoft Q&A — reactive:aws-garman-a100-demand
- [33] NDasrA100_v4 size series - Azure Virtual Machines | Microsoft Learn — reactive:aws-garman-a100-demand
- [34] Migrate your NVv3-series virtual machines by September 30, 2026 — reactive:aws-garman-a100-demand
- [35] Migrate your retiring Azure Virtual Machines (VMs) to latest ... - Reddit — reactive:aws-garman-a100-demand
- [36] AWS CEO Matt Garman said they have never retired an A100 server. — reactive:aws-garman-a100-demand (2026-04-26)
- [37] Amazon Web Services CEO Matt Garman said today there is so ... — reactive:aws-garman-a100-demand
- [38] AI capex ROI becomes key 2026 test for hyperscalers - Seeking Alpha — reactive:aws-garman-a100-demand
- [39] The AI Capex Debate: Misallocation or Generational ROIC? | InvestorPlace — reactive:aws-garman-a100-demand
- [40] The Flip Side podcast - Episode 82 | Barclays Investment Bank — reactive:aws-garman-a100-demand
- [41] Hyperscaler Capex Snowballs Toward $700B as Firms Stage AI Builds — reactive:aws-garman-a100-demand
- [42] AWS CEO Matt Garman said they have never retired an A100 server ... — reactive:aws-garman-a100-demand
- [43] Milk Road AI — reactive:aws-garman-a100-demand
- [44] $AMZN AWS CEO Matt Garman revealed that the company has never retired a single A100 server. — reactive:aws-garman-a100-demand (2026-04-27)
- [45] Launching GPU Instances on AWS: Understanding Capacity, Quotas, and Reservations — reactive:aws-garman-a100-demand
- [46] What AWS’s GPU Pricing Shift Reveals About Cloud Cost Risk - Amplix — reactive:aws-garman-a100-demand
- [47] How do you handle on-demand GPU instances for AI ... — reactive:aws-garman-a100-demand
- [48] AWS raises GPU prices 15% on a Saturday • The Register — reactive:aws-garman-a100-demand
- [49] The GPU Capacity Crisis: Why Enterprises Are Rethinking Where AI ... — reactive:aws-garman-a100-demand
- [50] AWS Raised Prices 15%? No, It's More Complicated Than That | LCMH - Digital Services — reactive:aws-garman-a100-demand
- [51] EC2 Capacity Blocks : r/aws — reactive:aws-garman-a100-demand
- [52] How Platform Engineering Impacts Cloud Costs in 2026 - Everyday IT — reactive:aws-garman-a100-demand
- [53] Amazon plunge continues $1T wipeout as AI bubble fears ignite sell ... — reactive:aws-garman-a100-demand
- [54] AWS CEO Matt Garman is pushing back hard on the idea ... - Reddit — reactive:aws-garman-a100-demand
- [55] The Real AI CapEx Problem No One Wants to Talk About — reactive:aws-garman-a100-demand
- [56] Concerns about the ai bubble and overbuilding capacity - Facebook — reactive:aws-garman-a100-demand
- [57] Hyperscaler AI Spending Doubts Rising - Futuriom — reactive:aws-garman-a100-demand
- [58] Are the Hyperscalers Turning Themselves into the Telecom ... — reactive:aws-garman-a100-demand
- [59] Parallels Between the Hyperscalers and the Telecom Firms of the 1990s | MOI Global — reactive:aws-garman-a100-demand
- [60] "This Time is Different": Will AI end up like the telecom bust? — reactive:aws-garman-a100-demand
- [61] The Treadmill: Why the AI Infrastructure Bet Breaks Every ... — reactive:aws-garman-a100-demand
- [62] 🔗: https://bit.ly/41P2wP4 The hyperscalers building the ... — reactive:aws-garman-a100-demand
- [63] AI Demand Boosts GPU Prices, Says Silicon Data CEO Carmen Li — reactive:aws-garman-a100-demand
- [64] Launching our H100 1 Year Rental Price Index - SemiAnalysis — reactive:aws-garman-a100-demand
- [65] Launching our H100 1 Year Rental Price Index | Sheng-Che Huang — reactive:aws-garman-a100-demand
- [66] The Great GPU Shortage – Rental Capacity — reactive:aws-garman-a100-demand
- [67] Cast AI Data Shows GPU Pricing Will See a Foundational Shift in 2026 — reactive:aws-garman-a100-demand
- [68] AWS, Microsoft, and Google Are Pricing Themselves Out of AI — reactive:aws-garman-a100-demand
- [69] AWS has “never retired” an Nvidia A100 server, CEO Matt Garman ... — reactive:aws-garman-a100-demand
- [70] AI demand is so high, AWS customers are trying to buy out its entire ... — reactive:aws-garman-a100-demand
- [71] We’re Using So Much AI That Computing Firepower Is Running Out — reactive:aws-garman-a100-demand
- [72] Amazon's retail business resolves internal GPU capacity shortage - DCD — reactive:aws-garman-a100-demand
- [73] AWS Hikes EC2 Capacity Block Rates by 15% in Uniform ML Pricing Adjustment - InfoQ — reactive:aws-garman-a100-demand
- [74] AWS just quietly increased EC2 Capacity Block prices – here's what you need to know | IT Pro — reactive:aws-garman-a100-demand
- [75] AWS EC2 Capacity Blocks Pricing Shifts to Certainty | Sanchit Vir Gogia posted on the topic | LinkedIn — reactive:aws-garman-a100-demand
- [76] AWS's GPU Price Hike Was Just the Opening Shot. Here's What's ... — reactive:aws-garman-a100-demand
- [77] Amazon EC2 capacity blocks for ML can be shared across multiple ... — reactive:aws-garman-a100-demand
- [78] AWS has “never retired” an Nvidia A100 server, CEO Matt Garman ... — reactive:aws-garman-a100-demand
- [79] Amazon Web Services CEO Matt Garman said today there is so ... — reactive:aws-garman-a100-demand
- [80] AWS CEO Matt Garman said today there is so much demand they ... — reactive:aws-garman-a100-demand
- [81] Amazon's Matt Garman: there is so much more demand than supply ... — reactive:aws-garman-a100-demand
- [82] SemiWiki.com's Post - LinkedIn — reactive:aws-garman-a100-demand