AMD ALERT 🚀 MI355 is now 40% cheaper than B200 on GLM5 architecture for Single Node serving FP8 14 weeks after the initi…

SemiAnalysis Twitter · SemiAnalysis (@SemiAnalysis_) · 2026-05-19

AMD's MI355 GPU is now 40% cheaper than NVIDIA's B200 for single-node FP8 serving of the GLM5 architecture on SGLang v0.12 (CUDA for the B200, ROCm for the MI355), a milestone reached 14 weeks after GLM5's initial launch.


Appears in

AMD and Google TPU Closing the Gap on NVIDIA

Extraction

Topics: amd-gpu, nvidia-gpu, ai-inference, hardware-benchmarks, rocm

Claims

AMD MI355 achieves 40% lower cost than NVIDIA B200 on the GLM5 architecture for single-node FP8 serving.
This result was achieved 14 weeks after GLM5's initial launch.
The benchmark covers both non-MTP and MTP (with speculative decoding) configurations on SGLang v0.12.
The comparison runs SGLang v0.12 on both software stacks: CUDA for the B200 and ROCm for the MI355.
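
The post does not say how "40% cheaper" was computed. A common way to compare serving cost is cost per million tokens: hourly GPU price divided by sustained throughput. The sketch below assumes that definition. The function names are hypothetical, and the prices and throughputs are made-up placeholders chosen only so the arithmetic is easy to follow; they are not figures from the benchmark.

```python
def cost_per_million_tokens(gpu_hour_price_usd: float,
                            tokens_per_second_per_gpu: float) -> float:
    """Cost in USD to serve one million tokens on one GPU.

    Assumed definition: hourly price divided by tokens served per hour.
    """
    tokens_per_hour = tokens_per_second_per_gpu * 3600
    return gpu_hour_price_usd / tokens_per_hour * 1_000_000


def percent_cheaper(candidate_cost: float, baseline_cost: float) -> float:
    """How much cheaper the candidate is than the baseline, in percent."""
    return (1 - candidate_cost / baseline_cost) * 100


# Placeholder inputs only, NOT measured values from the source.
baseline = cost_per_million_tokens(gpu_hour_price_usd=4.0,
                                   tokens_per_second_per_gpu=2000)
candidate = cost_per_million_tokens(gpu_hour_price_usd=3.0,
                                    tokens_per_second_per_gpu=2500)

print(f"baseline:  ${baseline:.4f} per 1M tokens")
print(f"candidate: ${candidate:.4f} per 1M tokens")
print(f"candidate is {percent_cheaper(candidate, baseline):.1f}% cheaper")
```

Under this definition, a cost advantage can come from a lower hourly price, higher throughput, or both, so the headline percentage alone does not reveal which factor dominates.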

Key quotes

AMD ALERT 🚀 MI355 is now 40% cheaper than B200 on GLM5 architecture for Single Node serving FP8 14 weeks after the initial launch of GLM5 on both non-MTP & MTP with spec decode for SGLang v0.12 for both CUDA & ROCm. SPEED IS THE MOAT!!