Developers found a cheaper way to feed Fable 5 large context by showing it pictures of text.

Twitter · Rohan Paul (@rohanpaul_ai) · 2026-07-04

A tool called pxpipe reportedly cuts Fable 5 inference costs by roughly 60%. It renders dense text as PNG images and exploits the model's fixed per-image vision token cost, fitting up to 92K characters of code into a single image block of about 4,761 tokens.


Appears in

LLM Inference Efficiency: Phase, Layer, and Time Splitting Strategies Driving Cost Compression

Extraction

Topics: llm-inference-cost, vision-language-models, context-compression, token-optimization

Claims

pxpipe renders text-heavy context (code, logs, chat history) into PNG images before sending them to Fable 5 as vision inputs.
A 1928×1928 image costs approximately 4,761 vision tokens regardless of how much readable text is packed into it.
The same image can hold roughly 92K characters of dense code, so sending large context as an image costs substantially fewer tokens than sending the same content as text.
The technique achieves approximately 60% cost reduction on Fable 5 according to the originating developer.
The approach is lossy: Fable 5 may misread exact strings, hashes, or IDs, making it unsuitable for byte-exact facts.
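
The arithmetic behind these claims can be sketched as follows. The 4,761-token and 92K-character figures come from the source. The characters-per-text-token ratio is an assumption (about 4 is a common rule of thumb for English prose; the source gives no ratio for Fable 5's tokenizer or for code), and the function names are hypothetical.

```python
import math

# Figures quoted in the source post.
IMAGE_TOKENS = 4761    # approximate cost of one 1928x1928 image
IMAGE_CHARS = 92_000   # approximate characters that fit in that image


def chars_per_image_token() -> float:
    """Characters carried per vision token when the image is full."""
    return IMAGE_CHARS / IMAGE_TOKENS


def text_tokens(num_chars: int, chars_per_text_token: float) -> int:
    """Estimated text tokens for num_chars; the ratio is an assumption."""
    return math.ceil(num_chars / chars_per_text_token)


def token_saving(num_chars: int, chars_per_text_token: float) -> float:
    """Fraction of input tokens saved by sending one image instead of text.

    Assumes the image always costs IMAGE_TOKENS and that the content
    fits in a single image. A negative result means text is cheaper.
    """
    if num_chars > IMAGE_CHARS:
        raise ValueError("content does not fit in one image")
    as_text = text_tokens(num_chars, chars_per_text_token)
    return 1 - IMAGE_TOKENS / as_text


if __name__ == "__main__":
    print(f"{chars_per_image_token():.1f} chars per vision token")
    print(f"full page saving:  {token_saving(92_000, 4.0):.0%}")
    print(f"small file saving: {token_saving(10_000, 4.0):.0%}")
```

Under these assumptions a full image carries about 19 characters per vision token, and the image only pays off once the text would exceed 4,761 tokens; a 10,000-character snippet comes out cheaper as plain text. These numbers describe a single converted block, not the overall ~60% cost reduction reported by the developer, which the source does not break down.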

Key quotes

~60% Fable cost cut by transparently turning the code into an image and having the model OCR it. WILD idea. also hilarious.

a 1928×1928 image costs about 4,761 vision tokens. The same page can hold roughly 92K characters, so dense code becomes cheaper.

The catch is that this is compression through vision, not lossless text storage.
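
One way to act on this caveat is to keep byte-exact strings out of the image path. The following is a minimal sketch of a hypothetical pre-filter, not pxpipe's documented behavior: it flags lines containing long hex strings or UUID-shaped IDs so they can be sent as plain text. The patterns and function names are illustrative assumptions.

```python
import re

# Heuristic patterns for strings that must survive byte-for-byte.
# These are illustrative assumptions, not part of pxpipe.
EXACT_PATTERNS = [
    re.compile(r"\b[0-9a-fA-F]{16,}\b"),  # long hex runs, e.g. commit hashes
    re.compile(
        r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
        r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
    ),  # UUID-shaped IDs
]


def needs_exact_text(line: str) -> bool:
    """True if the line contains a string that OCR might corrupt."""
    return any(p.search(line) for p in EXACT_PATTERNS)


def split_context(text: str) -> tuple[list[str], list[str]]:
    """Return (lines safe to render as an image, lines to keep as text)."""
    image_lines: list[str] = []
    text_lines: list[str] = []
    for line in text.splitlines():
        if needs_exact_text(line):
            text_lines.append(line)
        else:
            image_lines.append(line)
    return image_lines, text_lines
```

Splitting by line loses the original ordering between the two groups, so a real pipeline would likely need to keep line numbers or send the exact strings in addition to the image rather than instead of it.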