How fast is 10 tokens per second really?
Simon Willison · 2026-05-20
Mike Veerman's open-source HTML tool lets users compare LLM token output speeds from 5 to 800 tokens per second interactively, providing concrete intuition for speed figures cited in model marketing.
Extraction
Topics: llm-inference, developer-tools, token-generation-speed
Claims
- LLM token generation speeds are commonly advertised in tokens-per-second but are difficult to intuit without a visual reference.
- The tool simulates output at speeds ranging from 5 to 800 tokens per second.
- The tool's source code is publicly available.
Key quotes
Useful if you see a model advertised as '30 tokens/second' and want to get a feel for what that actually looks like.
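To put rough numbers on a figure like '30 tokens/second', here is a minimal Python sketch. It is not the tool's actual implementation (the tool is HTML), and the names `seconds_to_generate` and `stream` are hypothetical. It assumes a constant generation rate and treats whitespace-separated words as stand-ins for tokens, which real tokenizers do not do.

```python
import sys
import time


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock time to emit num_tokens at a constant rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def stream(tokens, tokens_per_second, out=sys.stdout, sleep=time.sleep):
    """Write tokens one at a time, pausing 1/rate seconds after each."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    delay = 1.0 / tokens_per_second
    for token in tokens:
        out.write(token)
        out.flush()
        sleep(delay)


if __name__ == "__main__":
    # Assumption: words approximate tokens; real tokenizers split differently.
    words = [w + " " for w in "The quick brown fox jumps over the lazy dog.".split()]
    for rate in (5, 30, 800):  # the low end, the quoted example, the high end
        print(f"\n--- {rate} tokens/second ---")
        stream(words, rate)
    print()
```

Under these assumptions, a 300-token reply takes 60 seconds at 5 tokens/second, 10 seconds at 30 tokens/second, and under half a second at 800 tokens/second.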