NVIDIA Launches Vera CPU and Vera Rubin NVL72 at COMPUTEX / GTC Taipei

Version 10

2026-06-01 02:17 UTC · 228 items

Changes since v9

Two developments materially update the prior synthesis. First, Dell delivered the world's first fully validated Vera Rubin NVL72 rack to CoreWeave on May 31, clearing L11 diagnostics (single-rack scale-up domain operational) [4][5][6]. This is the first time physical Rubin hardware has been placed in a cloud provider's hands; it advances the story from announcement to a physical deployment milestone and introduces CoreWeave and Dell as new first-mover voices in the arc. Second, SemiAnalysis published technical architecture details showing that Rubin's performance advantage is concentrated in low-precision workloads: FP4/FP8 FLOPs scale ~3.5x over GB200, while FP16 gains are only ~1.6x and HBM capacity remains flat [11]. This qualifies the headline efficiency claims and creates a new tension between NVIDIA's uniform marketing narrative and workload-specific performance.

What

NVIDIA's Vera Rubin platform has crossed its first concrete hardware deployment milestone: Dell delivered the world's first fully validated Vera Rubin NVL72 rack to CoreWeave, and the rack passed L11 diagnostics, confirming that its internal scale-up domain is operational [4][5][6]. The system is not yet a full cluster, since scale-out networking validation (L12) remains ahead, but physical Rubin hardware is now in a cloud provider's hands for the first time [7]. The performance picture has also sharpened: Vera CPU shows a 1.5x overall advantage over a current 128-core x86 processor and a 1.6x geometric mean improvement over NVIDIA's own Grace CPU [8][9], while SemiAnalysis documents Rubin GPU FP4/FP8 compute at ~3.5x the GB200 baseline and HBM bandwidth at ~2.8x GB300, but FP16 gains of only ~1.6x [11].

Why it matters

The Dell-CoreWeave L11 validation moves Vera Rubin from announcement to physical deployment reality — the first production-validated NVL72 rack in a cloud provider's hands. The SemiAnalysis technical breakdown [11] reveals that Rubin's performance advantage is concentrated in low-precision (FP4/FP8) inference workloads, with FP16 gains of only ~1.6x over GB200 and flat HBM capacity — clarifying that efficiency claims are workload-specific rather than uniform, which matters for customers whose training or high-precision inference pipelines do not benefit equally from the headline figures.

Open questions

CoreWeave and Dell have achieved L11 validation (single-rack scale-up domain) for the NVL72 [4][6], but L12 full-cluster validation with scale-out networking has not yet been demonstrated — when does the first L12 milestone occur, and at what cluster scale?
SemiAnalysis documents Rubin's FP4/FP8 FLOPs at ~3.5x GB200 but FP16 at only ~1.6x, with HBM capacity flat [11] — for training workloads or precision-sensitive inference requiring FP16 or higher, does Rubin's advantage shrink to the point where GB200-era hardware remains competitive on a cost-per-useful-FLOP basis?
Phoronix benchmark results for Vera CPU appear via NVIDIA's promotional blog rather than as an independent Phoronix publication [8][9] — when does Phoronix publish its own unmediated review, and does the benchmark suite weight memory-bound workloads where Vera's LPDDR5X architecture excels?
Micron's IR press release asserts high-volume HBM4 production for Vera Rubin [26] while multiple sources conclude NVIDIA designated Samsung and SK Hynix as sole suppliers [27][28][29] — has Micron secured an actual production allocation or is the claim aspirational?

Narrative

NVIDIA's Vera Rubin platform, which Jensen Huang declared to be in full production at CES 2026 [1][2] and which features 336 billion transistors and 50 petaflops of AI performance [3], has reached its first concrete deployment milestone. On May 31, 2026, Dell delivered the world's first fully validated Vera Rubin NVL72 rack to CoreWeave, with the system passing L11 diagnostics that confirm the rack's internal NVLink/IMEX scale-up domain is operational [4][5][6]. In NVIDIA's three-stage validation hierarchy, L10 certifies single-server firmware and OS, L11 certifies a single rack or scale-up domain, and L12 certifies a full compute cluster with scale-out networking [7]. The next phase involves burning in multiple racks before software-level bringup using frameworks including SGLang, vLLM, and Dynamo [4]. The system is not yet a full cluster, since scale-out networking configuration remains ahead, but physical Rubin hardware is now in a cloud provider's hands for the first time.
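
The three validation levels described above can be modeled as an ordered enumeration. The sketch below is illustrative only: the class and function names are hypothetical and do not correspond to any NVIDIA, Dell, or CoreWeave tooling, and the level descriptions simply paraphrase the hierarchy reported in [7].

```python
from enum import IntEnum
from typing import Optional


class ValidationLevel(IntEnum):
    """Hypothetical model of the L10/L11/L12 hierarchy described in [7]."""
    L10 = 10  # firmware/BIOS and OS work on a single server
    L11 = 11  # a single rack (scale-up domain) works
    L12 = 12  # a full cluster with scale-out networking works


def next_level(passed: ValidationLevel) -> Optional[ValidationLevel]:
    """Return the stage that remains after `passed`, or None if none is left."""
    later = [level for level in ValidationLevel if level > passed]
    return min(later) if later else None


# The CoreWeave rack has passed L11, so the remaining stage is the cluster level.
remaining = next_level(ValidationLevel.L11)
print(remaining.name if remaining else "fully validated")
```

Treating the levels as ordered values makes the distinction explicit: a rack that has cleared L11 has not yet demonstrated anything about scale-out networking.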

The performance case for Vera CPU rests on Phoronix benchmark results published via NVIDIA's promotional blog: a 1.5x overall performance advantage over a current 128-core x86 processor, a 1.6x geometric mean improvement over NVIDIA's own Grace CPU, and 90% of rated peak memory bandwidth in STREAM TRIAD tests [8][9]. Phoronix founder Michael Larabel described Vera as 'the most formidable competition to Intel and AMD x86_64 processors ever realized from non-x86 architecture.' The 1.5x headline sits well above the roughly 10% geometric mean improvement reported over AMD's EPYC 9575F, and the gap reflects benchmark weighting: Vera leads decisively on memory-bound workloads and more narrowly on compute-bound tasks, while consuming under 30 watts of memory power versus over 100 watts for traditional DDR5. NVIDIA's own DGX Rubin NVL8 reference system also supports Intel Xeon 6 as an alternative host CPU option [10], implicitly acknowledging continued x86 relevance within the platform.
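
To see how a headline ratio and a suite-wide geometric mean can diverge, the sketch below averages a set of per-benchmark speedups. The ratios are invented purely for illustration and are not Phoronix or NVIDIA data; the only point is that a geometric mean is pulled toward the near-parity results even when a few memory-bound tests show large leads.

```python
from statistics import geometric_mean

# HYPOTHETICAL per-benchmark speedups (new CPU / baseline CPU).
# These are illustrative values, not measured results.
memory_bound = [1.6, 1.5, 1.4]          # large leads on bandwidth-heavy tests
compute_bound = [1.05, 1.0, 0.95, 0.9]  # near parity on compute-heavy tests

suite = memory_bound + compute_bound

best_case = max(suite)
memory_only = geometric_mean(memory_bound)
overall = geometric_mean(suite)

print(f"best single result:   {best_case:.2f}x")
print(f"memory-bound geomean: {memory_only:.2f}x")
print(f"whole-suite geomean:  {overall:.2f}x")
```

Which subset of tests a summary figure is computed over therefore matters as much as the figure itself, which is why the weighting of the benchmark suite is worth asking about.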

SemiAnalysis provides technical architecture context for the Rubin GPU: FP4 and FP8 FLOPs scale approximately 3.5x over GB200 while FP16 FLOPs increase by only ~1.6x; HBM bandwidth scales ~2.8x over GB300 while HBM capacity remains flat [11]. The platform uses a 3nm process with disaggregated I/O chiplets and two reticle-sized dies with eight HBM stacks, adopting a more modular design than Grace Blackwell for integration efficiency [11]. SemiAnalysis characterizes NVIDIA's competitive moat as vertical integration across all major silicon components in the AI server system — a position no other vendor currently matches. The FP4/FP8 versus FP16 asymmetry means Rubin's headline efficiency gains are concentrated in low-precision inference, not uniformly distributed across training or high-precision workloads.
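
The asymmetry above can be made concrete with a rough blended-speedup estimate. The sketch below is a back-of-the-envelope model, not a benchmark: it takes the ~3.5x and ~1.6x scaling figures from [11] and assumes a workload's baseline runtime splits cleanly into FP4/FP8-bound and FP16-bound compute, ignoring memory bandwidth, memory capacity, and networking. The function name and the runtime-fraction parameter are hypothetical.

```python
# Approximate per-precision compute scaling over GB200, as reported in [11].
FP4_FP8_SCALING = 3.5
FP16_SCALING = 1.6


def blended_speedup(low_precision_fraction: float) -> float:
    """Estimate overall compute speedup for a mixed-precision workload.

    `low_precision_fraction` is the share of baseline runtime spent in
    FP4/FP8 math; the remainder is assumed to be FP16. This is a
    simplifying assumption that ignores memory and network effects.
    """
    if not 0.0 <= low_precision_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    new_time = (low_precision_fraction / FP4_FP8_SCALING
                + (1.0 - low_precision_fraction) / FP16_SCALING)
    return 1.0 / new_time


for fraction in (0.0, 0.5, 1.0):
    print(f"{fraction:.0%} low precision -> {blended_speedup(fraction):.2f}x")
```

Under these assumptions, a workload split evenly between the two precision classes lands near 2.2x rather than 3.5x, because the slower-scaling FP16 portion dominates the new runtime.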

The infrastructure commitments surrounding Vera Rubin are substantial and structurally complex. NVIDIA's confirmed ~$674M equity stake in European cloud provider Nscale [12][13] means NVIDIA simultaneously functions as chip designer, financial backer of the primary deployment operator, and co-party to Microsoft's infrastructure commitments. Those commitments encompass 130,000 Rubin GPUs deployed via Nscale, including the Start Campus site in Portugal, and a 1.35GW Letter of Intent for Nscale's West Virginia Monarch Compute Campus [14][15]. A complete OEM tier covers both form factors: Dell, Supermicro, HPE, and PEGATRON for the NVL72 [16][17][18], and Compal, PEGATRON, Aivres, and Supermicro for the HGX Rubin NVL8 [19][16][20]. Two structural constraints persist: an HBM4 supply shortage projected to last until 2028 [21], with rack prices reaching $8.8M [22][23]; and the NVL72's 600kW per-rack power requirement, which makes most existing data center infrastructure physically incompatible and mandates greenfield construction [24][25].

Timeline

2026-01-05: NVIDIA debuts Rubin chip at CES: 336 billion transistors, 50 petaflops AI performance [3]
2026-01: Jensen Huang announces at CES 2026 keynote that Vera Rubin NVL72 is in full production [1][2]
2026-02: SK Hynix begins HBM4 mass production shipments to NVIDIA; holds approximately 70% of NVIDIA's HBM4 orders [49][44]
2026-03-17: Nscale acquires 8GW Monarch Compute Campus in West Virginia; Microsoft signs 1.35GW LOI co-announced with NVIDIA and Caterpillar [50][36][15][48]
2026-05-18: First Vera CPUs hand-delivered to OpenAI, Anthropic, and other leading AI labs [51][52][53]
2026-05-18: Jensen Huang keynotes Dell Technologies World: projects $3–4T AI infrastructure buildout by 2030 and flags memory supply as primary bottleneck [30][54][32]
2026-05-21: NVIDIA reports Q1 2026 earnings: $81.6B revenue, up 85% year-over-year [31][55]
2026-05-21: NVIDIA GTC Taipei: Vera Rubin NVL72 wins Computex Best Choice Golden Award; Meta, Google Cloud, and Microsoft formalize partnerships [56][57][33][34][35]
2026-05: Samsung sells out entire 2026 HBM4 supply; rack prices reach $8.8M; shortage projected to persist until 2028 [45][22][23][21]
2026-05: Multiple analyses document Vera Rubin NVL72's 600kW per-rack power requirement as incompatible with existing data centers, requiring greenfield construction [24][25][46]
2026-05: NVIDIA equity stake in Nscale confirmed at approximately $674M within a $1.1B funding round [12][13][58]
2026-05: Microsoft's Rubin GPU deployment via Nscale revised upward to 130,000 units, including Start Campus Portugal (200MW) [14][59][38][39]
2026-05: Compal, PEGATRON, Aivres, and Supermicro announce HGX Rubin NVL8 systems; NVIDIA DGX Rubin NVL8 supports Intel Xeon 6 as alternative host CPU alongside Vera [20][19][16][10][17]
2026-05-26: Phoronix benchmarks of Vera CPU published via NVIDIA blog: 1.5x overall x86 advantage, 1.6x geometric mean over Grace CPU, 90% peak bandwidth utilization [8][9]
2026-05-31: Dell delivers world's first fully validated Vera Rubin NVL72 rack to CoreWeave; L11 diagnostics confirm the scale-up domain is operational, while full-cluster L12 validation with scale-out networking remains ahead [4][5][6][7]
2026-06-01: COMPUTEX 2026 main event scheduled (June 1–5) [60]

Perspectives

NVIDIA / Jensen Huang

Maximally bullish: Q1 2026 earnings ($81.6B, +85% YoY) validate 'parabolic' AI demand; Phoronix benchmark results confirm Vera CPU's generational leadership over x86; NVIDIA is co-party to the West Virginia infrastructure deal and confirmed equity holder in Nscale.

Evolution: Consistent; Dell's delivery of the first validated NVL72 rack to CoreWeave advances the deployment narrative from announcement to physical milestone.

[30][31][32][33][34][35][36][37][8]

Dell / CoreWeave

Dell delivered the world's first fully validated Vera Rubin NVL72 rack to CoreWeave, clearing L11 diagnostics; CoreWeave is now the first cloud provider with a validated NVL72 system, with multi-rack burn-in and software bringup as the next phase.

Evolution: New voice in the arc — CoreWeave and Dell emerge as first-mover operators on the deployment timeline, distinct from the hyperscaler LOI commitments that had previously dominated the deployment narrative.

[4][5][6]

Microsoft / Nscale

Anchor customer for Vera Rubin NVL72 across two continents via Nscale: 130,000 Rubin GPUs deployed via Nscale, including the Start Campus site in Portugal, and a 1.35GW LOI for the West Virginia Monarch Compute Campus, co-announced with NVIDIA.

Evolution: Consistent; NVIDIA's confirmed ~$674M equity in Nscale means deployment announcements co-publicized by both companies are not arm's-length commercial transactions.

[35][38][39][15][14][12][13]

OEM ecosystem (Dell, Supermicro, HPE, Compal, PEGATRON, Aivres)

A complete multi-vendor OEM tier covers both the NVL72 (Dell, Supermicro, HPE, PEGATRON) and the HGX Rubin NVL8 (Compal, PEGATRON, Aivres, Supermicro); Dell's delivery of the first validated NVL72 rack to CoreWeave marks the OEM tier's first production milestone.

Evolution: Dell has moved from system announcement to first physical delivery — the most concrete OEM execution milestone in the arc.

[40][41][18][20][19][17][16][10][4][5]

SemiAnalysis (technical analysis)

NVIDIA's competitive moat stems from vertical integration across all major silicon components; Rubin's FP4/FP8 FLOPs scale ~3.5x over GB200 while FP16 gains are only ~1.6x and HBM capacity remains flat — Rubin is optimized for low-precision inference bandwidth, not uniformly superior across all workload types.

Evolution: Consistent analytical framing but newly granular: the FP4/FP8 versus FP16 asymmetry and flat HBM capacity qualify the headline efficiency claims in ways NVIDIA's marketing does not.

[4][11][6][42][7]

Micron

Officially asserts high-volume HBM4 production specifically designed for NVIDIA Vera Rubin via an investor relations press release, directly contradicting industry reports concluding NVIDIA designated Samsung and SK Hynix as sole suppliers.

Evolution: Consistent; the contradiction with multiple independent sources remains unresolved.

[26][43][27][28][29]

Memory and supply chain analysts

HBM4 shortage is the binding structural constraint: SK Hynix holds ~70% of NVIDIA's orders, Samsung has sold out 2026 supply, rack prices have surged to $8.8M, and shortage is projected to persist until 2028.

Evolution: Consistent; deployment commitments now documented at scale increase the stakes of whether memory supply can support announced timelines.

[44][21][45][22][23]

Data center infrastructure analysts

Vera Rubin NVL72's 600kW per-rack power requirement is a fundamental incompatibility with existing data center infrastructure — not a retrofit problem but a greenfield requirement — establishing a second binding structural bottleneck alongside HBM4 supply.

Evolution: Consistent; the constraint remains unaddressed in NVIDIA's public deployment announcements.

[24][25][46][47]

Tensions

Micron's official IR press release states high-volume HBM4 production for NVIDIA Vera Rubin [26], while TechPowerUp, a Substack analysis, and a video explainer all conclude NVIDIA designated only Samsung and SK Hynix as HBM4 suppliers [27][28][29] — two claims that are mutually incompatible unless they refer to different allocation tiers. [26][27][28][29]
NVIDIA markets Vera Rubin on a 10x cost-per-token reduction, but SemiAnalysis documents that Rubin's FP4/FP8 gains over GB200 are ~3.5x while FP16 gains are only ~1.6x and HBM capacity is flat [11], and real rack economics show a 485% memory price surge and 600kW power requirements incompatible with most existing facilities [22][24] — the efficiency claim is workload-specific, not uniform. [11][22][23][24][25]
NVIDIA is confirmed to hold approximately $674M in Nscale equity [13] while publicly describing Nscale only as a commercial partner. If NVIDIA holds substantial equity in the cloud provider deploying its chips at scale, the deployment announcements and pricing co-publicized by both companies are not arm's-length commercial transactions [37][15]. [37][12][13][15][36]
NVIDIA positions Vera CPU as the purpose-built replacement for x86 in agentic AI workloads [8][9], while NVIDIA's own DGX Rubin NVL8 reference system supports Intel Xeon 6 as a host CPU option alongside Vera [10] — implicitly acknowledging continued x86 relevance within NVIDIA's own platform architecture. [8][9][10]
The West Virginia Monarch Compute Campus deal is structured as a Letter of Intent [15] rather than a binding contract, while Nscale and NVIDIA have co-publicized it in terms implying a firm 1.35GW commitment — and NVIDIA's confirmed equity in Nscale adds a further complication to assessing the independence of the parties [13]. [15][36][48][13]
Dell and SemiAnalysis frame the CoreWeave NVL72 rack delivery as the 'world's first fully validated' system [4][5], but L11 diagnostics confirm only a single-rack scale-up domain — scale-out networking and full-cluster L12 validation have not yet been demonstrated [6][7], a distinction that matters for assessing real deployment readiness. [4][5][6][7]

Sources

[1] Nvidia CEO confirms Vera Rubin NVL72 is now in production — reactive:nvidia-vera-computex-launch
[2] NVIDIA Vera Rubin AI Platform Hits Full Production CES 2026 ... — reactive:nvidia-vera-computex-launch
[3] Nvidia debuts Rubin chip with 336B transistors and 50 petaflops of AI performance - SiliconANGLE — reactive:nvidia-vera-computex-launch
[4] BREAKING NEWS: COREWEAVE & DELL IS THE FIRST CLOUD TO ANNOUNCE THAT THEY HAVE RUBIN VR200 NVL72 WITH FULLY PASSING L… — SemiAnalysis Twitter (2026-05-31)
[5] Dell just made history this weekend and it is the culmination of an execution streak that no other company in enterprise… — Milk Road AI Twitter (2026-05-31)
[6] Notably, passing L11 diags means that this rack is up and running, including the IMEX channels on the NVL72 scale-up dom… — SemiAnalysis Twitter (2026-05-31)
[7] At L10 your Firmware/BIOS and OS works on a single server, at L11 a single rack or scale-up domain works, and then at L1… — SemiAnalysis Twitter (2026-05-31)
[8] NVIDIA Vera CPU Is ‘Packing a Heavy-Hitting Punch’ Against Competition — NVIDIA Blog (2026-05-26)
[9] NVIDIA published a report on Vera CPU benchmarks, done by Phoronix. — Rohan Paul Twitter (2026-05-28)
[10] NVIDIA DGX Rubin NVL8 Supports Intel Xeon 6 as Host CPU Option for x86-Based AI Inference - StorageReview.com — reactive:nvidia-vera-computex-launch
[11] for more details on Nvidia's VR NVL72 Oberon and future roadmap, check out our article from February: — SemiAnalysis Twitter (2026-05-31)
[12] Nvidia-backed UK AI firm Nscale raises $1.1 billion funding round — reactive:nvidia-vera-computex-launch
[13] UK AI Infrastructure Startup Nscale Receives $674 Million (£500 ... — reactive:nvidia-vera-computex-launch
[14] 130,000 Rubin GPUs Are Being Deployed at Nscale For Microsoft, Further Showing Massive Interest In NVIDIA's Next-Gen AI Chips — reactive:nvidia-vera-computex-launch
[15] Nscale acquires 8GW Monarch Compute Campus, Microsoft signs on for 1.35GW of compute - DCD — reactive:nvidia-vera-computex-launch
[16] PEGATRON Unveils Next Generation AI Platforms Powered by NVIDIA Vera Rubin NVL72 and NVIDIA HGX Rubin NVL8 at GTC 2026 | PEGATRON SVR — reactive:nvidia-vera-computex-launch
[17] Supermicro Reveals DCBBS® with New NVIDIA Vera Rubin NVL72 ... — reactive:nvidia-vera-computex-launch
[18] Pioneering the next era of gigascale AI with NVIDIA Vera Rubin ... — reactive:nvidia-vera-computex-launch
[19] Compal Introduces High-Density NVIDIA HGX™ Rubin NVL8 Integrated Solution at GTC 2026 — reactive:nvidia-vera-computex-launch
[20] KR5288 Rubin | 5U NVIDIA HGX™ Rubin NVL8 Server - Aivres — reactive:nvidia-vera-computex-launch
[21] SK Hynix Surges 15% to New High: HBM Shortage Until 2028, How Much Longer Can AI Memory King Rise? — reactive:nvidia-vera-computex-launch
[22] Nvidia's memory costs soar 485%, latest AI systems now cost $7.8 ... — reactive:nvidia-vera-computex-launch
[23] Price of Nvidia's Vera Rubin NVL72 racks skyrockets to as much as $8.8 million apiece, but server makers' margins will be tight — Nvidia is moving closer to shipping entire full-scale systems — reactive:nvidia-vera-computex-launch
[24] The Data Center Isn't Ready. NVIDIA's Vera Rubin platform ships in… — reactive:nvidia-vera-computex-launch
[25] NVIDIA Vera Rubin: 600kW Racks by 2027 | Introl Blog — reactive:nvidia-vera-computex-launch
[26] Micron in High-Volume Production of HBM4 Designed for NVIDIA ... — reactive:nvidia-vera-computex-launch
[27] Micron Is Locked Out of HBM4 in NVIDIA's Vera Rubin Systems — reactive:nvidia-vera-computex-launch
[28] NVIDIA to Use SK hynix and Samsung HBM4 for "Vera Rubin" Without Micron | TechPowerUp — reactive:nvidia-vera-computex-launch
[29] Why Nvidia Snubbed Micron For Samsung, SK Hynix - Dailymotion — reactive:hbm-memory-supply-squeeze
[30] NVIDIA CEO Jensen Huang at Dell Technologies World: ‘Demand Is Going Parabolic, Utterly Parabolic’ — NVIDIA Blog (2026-05-18)
[31] NVIDIA just dropped $81.6B in Q1 revenue up 85% YoY 🤯 — reactive:nvidia-vera-computex-launch (2026-05-21)
[32] Jensen Huang today: Memory demand >> supply chain capacity. “Supply chain needs to be ready.” AI memory supercycle... — reactive:nvidia-vera-computex-launch (2026-05-18)
[33] Meta Builds AI Infrastructure With NVIDIA — reactive:nvidia-vera-computex-launch
[34] NVIDIA GTC 2026: Google Cloud Deepens Partnership for AI ... — reactive:nvidia-vera-computex-launch
[35] Microsoft's strategic AI datacenter planning enables seamless, large ... — reactive:nvidia-vera-computex-launch
[36] Nscale and Microsoft Announce Collaboration with NVIDIA and Caterpillar to Deliver 1.35GW of NVIDIA Vera Rubin NVL72 GPUs at Flagship AI Factory Campus in West Virginia — reactive:nvidia-vera-computex-launch
[37] Nvidia-Backed Nscale Plans Huge Data Center Cluster in West ... — reactive:nvidia-vera-computex-launch
[38] Nscale to Deliver 66,000+ NVIDIA Rubin GPUs to Microsoft at Start Campus' Site in Portugal — reactive:nvidia-vera-computex-launch
[39] 66,000+ NVIDIA Rubin GPUs. 200MW of AI infrastructure. One ... — reactive:nvidia-vera-computex-launch
[40] Michael Dell, Jensen Huang: Boldest Statements From Dell Technologies World 2026 — reactive:nvidia-vera-computex-launch
[41] Supermicro Announces Support for Upcoming NVIDIA Vera Rubin ... — reactive:nvidia-vera-computex-launch
[42] Diags means "diagnostics". It's a set of scripts that check everything is working, from the OS software configuration to… — SemiAnalysis Twitter (2026-05-31)
[43] Micron Singapore - Facebook — reactive:nvidia-vera-computex-launch
[44] SK Hynix Secures 70% of Nvidia's HBM4 Orders - Semicon — reactive:nvidia-vera-computex-launch
[45] Samsung sells out of 2026 HBM4 supply as memory resurgence ... — reactive:aws-garman-a100-demand
[46] Nvidia's Vera Rubin GPU: Redesigning Data Centres for 600kW Racks — reactive:nvidia-vera-computex-launch
[47] 600kW racks in 2027 | Maciek Szadkowski — reactive:nvidia-vera-computex-launch
[48] Nscale acquisition includes plan to build AI facility in Mason County — reactive:nvidia-vera-computex-launch
[49] SK Hynix set to ship HBM4 for Nvidia's Vera Rubin this month — reactive:nvidia-vera-computex-launch
[50] Nscale and Microsoft Announce Collaboration with NVIDIA and Caterpillar to Deliver 1.35GW of NVIDIA Vera Rubin NVL72 GPUs at Flagship AI Factory Campus in West Virginia — reactive:nvidia-vera-computex-launch
[51] Vera Arrives: NVIDIA’s First CPU Built for Agents Lands at Top AI Labs — NVIDIA Blog (2026-05-18)
[52] NVIDIA hand-delivers first 1.2 TB/s Vera CPUs to OpenAI, Anthropic ... — reactive:nvidia-vera-computex-launch
[53] Nvidia unveils details of new 88-core Vera CPUs positioned to compete with AMD and Intel – new Vera CPU rack features 256 liquid-cooled chips that deliver up to a 6X gain in CPU throughput | Tom's Hardware — reactive:nvidia-vera-computex-launch
[54] NVIDIA AI - Jensen Huang Says “Buy Dell” - LinkedIn — reactive:nvidia-vera-computex-launch
[55] "Demand has gone parabolic. The reason is simple: Agentic AI has arrived." — reactive:nvidia-vera-computex-launch (2026-05-21)
[56] NVIDIA GTC Taipei at COMPUTEX: Live Updates on What’s Next in AI — NVIDIA Blog (2026-05-21)
[57] NVIDIA Vera Rubin NVL72 wins Computex 2026 awards for AI ... — reactive:nvidia-vera-computex-launch
[58] Nscale raises $433m in pre-Series C funding, backed by Nvidia and others | Kai Nicol-Schwarz posted on the topic | LinkedIn — reactive:nvidia-vera-computex-launch
[59] NVIDIA Vera Rubin Deployment in Europe Announced — reactive:nvidia-vera-computex-launch
[60] NVIDIA GTC Taipei at COMPUTEX 2026 | June 1-5 — reactive:nvidia-vera-computex-launch