AMD ALERT 🚀 MI355 is now 40% cheaper than B200 on GLM5 architecture for Single Node serving FP8 14 weeks after the initi…
SemiAnalysis Twitter · SemiAnalysis (@SemiAnalysis_) · 2026-05-19
AMD's MI355 GPU is now 40% cheaper than NVIDIA's B200 for single-node FP8 serving of the GLM5 architecture, 14 weeks after GLM5's initial launch. The result covers both non-MTP and MTP with speculative decoding on SGLang v0.12, across CUDA and ROCm.
Extraction
Topics: amd-gpu, nvidia-gpu, ai-inference, hardware-benchmarks, rocm
Claims
- AMD MI355 achieves 40% lower cost than NVIDIA B200 on GLM5 architecture for single-node FP8 serving.
- This result was achieved 14 weeks after GLM5's initial launch.
- The benchmark covers both non-MTP serving and MTP with speculative decoding, on SGLang v0.12.
- The comparison runs SGLang v0.12 on both CUDA (the B200 side) and ROCm (the MI355 side).
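
The post does not say how "cheaper" is measured. One common way to compare serving cost is dollars per million generated tokens, derived from a GPU's hourly price and its sustained throughput. The sketch below assumes that metric. The function names, hourly prices, and throughput figures are all hypothetical and are not taken from the source; they are chosen only to show how a 40% gap would arise from the arithmetic.

```python
def cost_per_million_tokens(hourly_price_usd: float, tokens_per_second: float) -> float:
    """Dollars to generate one million tokens at a steady throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_price_usd / tokens_per_hour * 1_000_000


def relative_saving(cost_a: float, cost_b: float) -> float:
    """Fraction by which cost_a undercuts cost_b (0.40 means 40% cheaper)."""
    return 1 - cost_a / cost_b


# Hypothetical inputs for illustration only; NOT figures from the post.
gpu_a = cost_per_million_tokens(hourly_price_usd=2.40, tokens_per_second=2000)
gpu_b = cost_per_million_tokens(hourly_price_usd=4.00, tokens_per_second=2000)
saving = relative_saving(gpu_a, gpu_b)
print(f"GPU A is {saving:.0%} cheaper per token than GPU B")
```

Under this metric a cost gap can come from a lower hourly price, higher throughput, or both, so a "cheaper" headline by itself does not say which GPU is faster.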
Key quotes
AMD ALERT 🚀 MI355 is now 40% cheaper than B200 on GLM5 architecture for Single Node serving FP8 14 weeks after the initial launch of GLM5 on both non-MTP & MTP with spec decode for SGLang v0.12 for both CUDA & ROCm. SPEED IS THE MOAT!!