NVIDIA Launches Vera CPU and Vera Rubin NVL72 at COMPUTEX / GTC Taipei v1

What

NVIDIA is executing a coordinated hardware offensive centered on two products: the Vera CPU — its first processor purpose-built for agentic AI — which began shipping to leading AI labs on May 18 [1], and the Vera Rubin NVL72, a next-generation inference platform claiming 10x lower cost-per-token than the current Blackwell generation [2][3]. Jensen Huang delivered keynotes at both Dell Technologies World and GTC Taipei at COMPUTEX, framing the launches as proof that AI demand has entered a 'parabolic' phase [2]. The announcements span enterprise on-premises infrastructure, edge AI with Jetson Thor, and autonomous vehicles via the Alpamayo platform [3].

Why it matters

If NVIDIA's performance claims hold, the Vera Rubin NVL72 represents a step-change in inference economics that could accelerate enterprise adoption of large agentic workloads. The simultaneous emphasis on on-premises deployment, given that 67% of AI workloads now reportedly run outside the cloud [2], signals that NVIDIA is positioning itself to capture an infrastructure buildout that Huang estimates could reach $3–4 trillion by 2030 [2].

Open questions

The 10x cost-per-token and 50% faster agentic workload claims come entirely from NVIDIA's own promotional materials [2][3] — when will independent benchmarks validate or challenge these figures?

Vera CPU has begun shipping to 'top AI labs' [1], but broad commercial availability timelines for both Vera CPU and Vera Rubin NVL72 have not been publicly disclosed — what is the ramp schedule?

The Groq 3 LPX co-deployment claim of 35x higher throughput per watt for trillion-parameter models [3] is unusually specific — what does this partnership entail and how does it affect standalone Vera Rubin pricing?

With token consumption projected to grow 3,400% by 2030 and AI infrastructure spending projected to reach $3–4 trillion [2], how are AMD, Intel, and custom silicon (Google TPU, Amazon Trainium) positioned to contest NVIDIA's dominance in the agentic inference tier?

Narrative

NVIDIA's May 2026 product cycle represents the company's most explicit pivot toward 'agentic AI' as a hardware design target. The Vera CPU, now in early delivery to leading AI labs, is described as NVIDIA's first processor architected specifically for the memory-bandwidth demands of autonomous agent workloads [1]. With 1.2 TB/s of memory bandwidth, the Vera CPU is claimed to complete agentic tasks 50% faster than comparable x86 processors [2] — a direct shot at the Intel and AMD server CPUs that dominate existing data center deployments.

The larger announcement is the Vera Rubin NVL72, NVIDIA's next large-scale inference platform. NVIDIA claims it delivers 10x higher inference performance per watt and 10x lower cost-per-token compared to the Blackwell generation [2][3]. With the optional Groq 3 LPX co-processor integrated, the claimed figure rises to 35x higher throughput per watt for trillion-parameter models [3]. The system's physical design has also been rethought: a cable-free, hose-free, fanless modular tray reduces assembly time from two hours to five minutes per compute tray [3], a detail aimed squarely at hyperscale data center operators managing thousands of units.
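To put the claimed multipliers in concrete terms, the arithmetic can be sketched as follows. This is only an illustration: the baseline price of $2.00 per million tokens is a hypothetical placeholder rather than a figure from NVIDIA, and the function names are invented for this example. Only the 10x factor and the two-hours-to-five-minutes assembly times come from the claims above [2][3].

```python
def cost_after_reduction(baseline_cost: float, reduction_factor: float) -> float:
    """Cost after an 'N x lower' claim: the baseline divided by N."""
    return baseline_cost / reduction_factor


def speedup(old_minutes: float, new_minutes: float) -> float:
    """How many times faster the new process is than the old one."""
    return old_minutes / new_minutes


# Assumption: hypothetical baseline price, chosen only for illustration.
HYPOTHETICAL_BASELINE_USD_PER_M_TOKENS = 2.00

# Claimed figures taken from the text above.
CLAIMED_COST_REDUCTION = 10.0   # "10x lower cost-per-token"
OLD_ASSEMBLY_MINUTES = 120.0    # "two hours" per compute tray
NEW_ASSEMBLY_MINUTES = 5.0      # "five minutes" per compute tray

new_cost = cost_after_reduction(
    HYPOTHETICAL_BASELINE_USD_PER_M_TOKENS, CLAIMED_COST_REDUCTION
)
tray_speedup = speedup(OLD_ASSEMBLY_MINUTES, NEW_ASSEMBLY_MINUTES)

print(f"Illustrative cost per million tokens: ${new_cost:.2f}")
print(f"Assembly time improvement per tray: {tray_speedup:.0f}x")
```

Under these assumptions, a 10x reduction turns the placeholder $2.00 into $0.20 per million tokens, and the tray assembly claim works out to a 24x reduction in time per tray.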

Jensen Huang's keynote at Dell Technologies World provided the demand narrative to accompany the hardware. Huang argued that AI has entered the 'era of useful AI,' characterized by demand going 'parabolic, utterly parabolic' [2]. He cited a projection that worldwide AI infrastructure spending could reach $3–4 trillion by 2030, driven by a 3,400% growth in token consumption [2]. Notably, NVIDIA also emphasized that 67% of AI workloads now run outside the cloud — on-premises, at the edge, or in colocation facilities — framing the Dell AI Factory partnership as the canonical enterprise response to that shift [2]. NVIDIA Confidential Computing was highlighted as a mechanism allowing enterprises to run frontier models on-premises without exposing model weights or sensitive data [2].
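The 3,400% figure is easy to misread, so a short sketch of the conversion may help. Growth of 3,400% means the final value is 35 times the starting value, not 34 times. The source does not state the baseline year, so the four-year window used below is an assumption made purely for illustration, and the function names are hypothetical.

```python
def growth_multiplier(percent_growth: float) -> float:
    """Convert 'grows by P percent' into a final/initial ratio."""
    return 1.0 + percent_growth / 100.0


def implied_annual_rate(multiplier: float, years: float) -> float:
    """Constant yearly growth rate that compounds to the given multiplier."""
    return multiplier ** (1.0 / years) - 1.0


CLAIMED_GROWTH_PERCENT = 3400.0  # from the keynote projection [2]
ASSUMED_YEARS = 4.0              # assumption: baseline year is not given in the source

multiplier = growth_multiplier(CLAIMED_GROWTH_PERCENT)
annual_rate = implied_annual_rate(multiplier, ASSUMED_YEARS)

print(f"Token consumption multiplier: {multiplier:.0f}x")
print(f"Implied annual growth over {ASSUMED_YEARS:.0f} years: {annual_rate:.0%}")
```

A different assumed window changes the implied annual rate substantially, which is one reason the projection is hard to evaluate without knowing its baseline.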

At GTC Taipei during COMPUTEX, NVIDIA broadened the product story beyond data center inference. The Jetson Thor module for edge robotics is said to deliver 7.5x the compute and 3.5x the energy efficiency of the prior Jetson Orin generation within a 40–130 watt power envelope [3]. For autonomous driving, the Alpamayo platform pairs 10-billion-parameter chain-of-thought vision-language-action models with over 1,700 hours of open autonomous driving datasets, designed specifically for rare 'long-tail' edge cases such as ambiguous pedestrian hand signals or conflicting traffic signals [3]. All three sources covering these announcements [1][2][3] originate from NVIDIA's own blog, so the competitive performance claims remain unverified by independent third-party benchmarks.

Timeline

2026-05-18: NVIDIA begins shipping Vera CPUs to top AI labs [1]

2026-05-18: Jensen Huang keynotes at Dell Technologies World, announcing Vera Rubin NVL72 specs and projecting $3–4 trillion AI infrastructure buildout by 2030 [2]

2026-05-21: NVIDIA GTC Taipei at COMPUTEX: Vera Rubin NVL72, Jetson Thor, and Alpamayo autonomous driving platform detailed [3]

Perspectives

NVIDIA / Jensen Huang

Maximally bullish: the agentic AI era has definitively arrived, demand is 'parabolic,' the Vera CPU and Vera Rubin NVL72 are generational leaps that will reshape enterprise and edge AI economics, and the trajectory is so steep that eliminating disease may become conceivable within a generation.

Evolution: Consistent with prior Huang keynote framing but intensity has increased — the 'parabolic' language and $3–4 trillion infrastructure projection represent an escalation in confidence relative to earlier AI cycle claims.

[2][1][3]

Tensions

All performance claims (10x cost-per-token, 50% faster agentic workloads, 35x throughput per watt with Groq 3 LPX) originate exclusively from NVIDIA promotional sources with no independent corroboration in this thread — creating an implicit tension between NVIDIA's aggressive benchmarking narrative and the absence of third-party validation. [2][3]
