Google just made Gemma 4 much easier to run on phones and laptops by releasing QAT (Quantization-Aware Training) checkpo…
Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-06-05
Google releases Quantization-Aware Training checkpoints for Gemma 4 that shrink the smallest model from 11.4 GB to 1.1 GB (0.84 GB text-only), making it practical to run on consumer phones and laptops.
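As a quick check on what those figures imply, the sketch below computes the compression ratio and percentage reduction from the three sizes quoted in the post. The variable and function names are illustrative only, and nothing here assumes a particular bit width, which the post does not state.

```python
# Sizes in GB, taken directly from the post.
FULL_GB = 11.4
QAT_GB = 1.1
QAT_TEXT_ONLY_GB = 0.84


def size_reduction(before_gb, after_gb):
    """Return (compression ratio, percent reduction) for two sizes."""
    ratio = before_gb / after_gb
    percent = (1.0 - after_gb / before_gb) * 100.0
    return ratio, percent


ratio_full, pct_full = size_reduction(FULL_GB, QAT_GB)
ratio_text, pct_text = size_reduction(FULL_GB, QAT_TEXT_ONLY_GB)

print(f"QAT:           {ratio_full:.1f}x smaller ({pct_full:.1f}% reduction)")
print(f"QAT text-only: {ratio_text:.1f}x smaller ({pct_text:.1f}% reduction)")
```

So the quoted sizes work out to roughly a 10x reduction for the full checkpoint and between 13x and 14x for text-only use.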
Extraction
Topics: model-quantization, on-device-ai, gemma, edge-deployment
Claims
- Google released QAT checkpoints for Gemma 4 that reduce the smallest model's footprint from 11.4 GB to 1.1 GB.
- A text-only variant of the quantized model reaches 0.84 GB, enabling deployment on consumer phones and laptops.
- QAT preserves more model quality than standard post-training quantization because compression is incorporated during training rather than applied afterward.
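The last claim can be illustrated with a minimal sketch of the general QAT idea: the forward pass uses a quantized copy of the weight, while gradient updates are applied to a full-precision copy using a straight-through estimator. This is a toy, single-weight example and not Google's Gemma 4 recipe. The 4-bit width, the symmetric grid, and the names `fake_quantize` and `qat_step` are all assumptions made for illustration.

```python
def fake_quantize(w, num_bits=4, max_abs=1.0):
    """Round w to a symmetric uniform grid, then map it back to a float."""
    qmax = 2 ** (num_bits - 1) - 1       # e.g. 7 levels per side for 4 bits
    scale = max_abs / qmax               # spacing between grid points
    q = round(w / scale)                 # nearest integer level
    q = max(-qmax, min(qmax, q))         # clip to the representable range
    return q * scale


def qat_step(w, x, y, lr=0.1, num_bits=4):
    """One SGD step on loss = (w_q * x - y)**2 for a scalar weight."""
    w_q = fake_quantize(w, num_bits)     # forward pass sees the quantized weight
    err = w_q * x - y
    grad_wq = 2.0 * err * x              # d(loss) / d(w_q)
    # Straight-through estimator: treat d(w_q)/d(w) as 1, so the gradient
    # computed at the quantized weight updates the full-precision weight.
    return w - lr * grad_wq


# Train a latent full-precision weight while the loss only sees its
# quantized version, then quantize once more for "deployment".
w = 0.0
for _ in range(50):
    w = qat_step(w, x=1.0, y=0.6)
w_deployed = fake_quantize(w)
print(f"latent weight: {w:.4f}, deployed weight: {w_deployed:.4f}")
```

Post-training quantization would instead train `w` with no quantization in the loop and call `fake_quantize` only once at the end, so the model never gets a chance to adapt to the rounding error.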
Key quotes
Google just made Gemma 4 much easier to run on phones and laptops by releasing QAT (Quantization-Aware Training) checkpoints that shrink the smallest model from 11.4GB to 1.1GB, or 0.84GB for text-only use.