How fast is 10 tokens per second really?

Simon Willison · 2026-05-20

Mike Veerman's open-source HTML tool lets users interactively compare LLM token output speeds from 5 to 800 tokens per second, giving a concrete feel for the speed figures cited in model marketing.


Appears in

LLM Inference Efficiency Research Cluster

Extraction

Topics: llm-inference, developer-tools, token-generation-speed

Claims

LLM token generation speeds are commonly advertised in tokens-per-second but are difficult to intuit without a visual reference.
The tool simulates output at speeds ranging from 5 to 800 tokens per second.
The tool's source code is publicly available.
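
To make the idea behind these claims concrete, here is a minimal Python sketch of pacing text output at a fixed tokens-per-second rate. It is not the tool's actual source (the tool is an HTML page). The function names `seconds_to_emit` and `simulate_stream` are hypothetical, and treating each whitespace-separated word as one token is a simplifying assumption; real tokenizers split text differently.

```python
import sys
import time


def seconds_to_emit(num_tokens, tokens_per_second):
    """Wall-clock seconds needed to emit num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def simulate_stream(text, tokens_per_second, out=sys.stdout, sleep=time.sleep):
    """Write text one word at a time, pausing 1/tokens_per_second per word.

    Assumption: one whitespace-separated word stands in for one token.
    Returns the number of words written.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    delay = 1.0 / tokens_per_second
    words = text.split()
    for word in words:
        out.write(word + " ")
        out.flush()  # show each word immediately rather than buffering
        sleep(delay)
    out.write("\n")
    return len(words)


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog. " * 5
    for rate in (5, 30, 800):  # 5 and 800 are the endpoints cited above
        print(f"--- {rate} tokens/second ---")
        simulate_stream(sample, rate)
```

Under these assumptions, a 500-token answer takes 50 seconds at 10 tokens per second and under a second at 800, which is the kind of gap that is easier to feel by watching the output than by reading the numbers.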

Key quotes

Useful if you see a model advertised as '30 tokens/second' and want to get a feel for what that actually looks like.