Google just made Gemma 4 much easier to run on phones and laptops by releasing QAT (Quantization-Aware Training) checkpo…

Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-06-05

Google releases Quantization-Aware Training checkpoints for Gemma 4 that shrink the smallest model from 11.4 GB to 1.1 GB (0.84 GB text-only), making it practical to run on consumer phones and laptops.
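
The reported sizes work out to roughly a 10.4x reduction for the full checkpoint and about 13.6x for the text-only variant. A quick sketch of that arithmetic, using only the figures quoted above (the helper name `reduction` is illustrative, not from the source):

```python
# Sizes in GB, as quoted in the summary above.
ORIGINAL_GB = 11.4
QAT_GB = 1.1
QAT_TEXT_ONLY_GB = 0.84


def reduction(before_gb, after_gb):
    """Return (compression factor, percent of size saved)."""
    factor = before_gb / after_gb
    percent_saved = 100.0 * (1.0 - after_gb / before_gb)
    return factor, percent_saved


for label, size in [("QAT", QAT_GB), ("QAT text-only", QAT_TEXT_ONLY_GB)]:
    factor, saved = reduction(ORIGINAL_GB, size)
    print(f"{label}: {factor:.1f}x smaller, {saved:.1f}% saved")
```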

Appears in

Google I/O 2026: Gemini 3.5 and Agents-Everywhere Strategy

Extraction

Topics: model-quantization, on-device-ai, gemma, edge-deployment

Claims

Google released QAT checkpoints for Gemma 4 that reduce the smallest model's footprint from 11.4 GB to 1.1 GB.
A text-only variant of the quantized model reaches 0.84 GB, enabling deployment on consumer phones and laptops.
QAT preserves more model quality than standard post-training quantization because compression is incorporated during training rather than applied afterward.
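
The last claim can be illustrated with a minimal sketch of how quantization-aware training is commonly set up: the forward pass sees weights rounded to a low-precision grid ("fake quantization"), while gradient updates are applied to full-precision weights via a straight-through estimator. This is a generic toy, not Google's Gemma 4 recipe. The 4-bit width, the symmetric per-tensor scale, the one-sample linear model, and the names `fake_quantize` and `qat_step` are all assumptions made for illustration; the source does not state the bit width or the training setup.

```python
def fake_quantize(values, bits=4):
    """Round values to a symmetric signed integer grid, then map back to floats.

    bits=4 is an assumed width for illustration only.
    """
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # per-tensor scale
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def qat_step(weights, x, y, lr=0.1, bits=4):
    """One SGD step on squared error for a toy linear model y = w . x."""
    # Forward pass uses the quantized weights, so the loss already
    # reflects the rounding error the deployed model will have.
    q_weights = fake_quantize(weights, bits)
    pred = sum(w * xi for w, xi in zip(q_weights, x))
    error = pred - y
    # Straight-through estimator: treat rounding as identity in the
    # backward pass and update the full-precision weights.
    return [w - lr * 2.0 * error * xi for w, xi in zip(weights, x)]


# Hypothetical toy data: train with quantization in the loop, then export.
weights = [0.9, -0.4, 0.05]
for _ in range(50):
    weights = qat_step(weights, x=[1.0, 2.0, -1.0], y=0.3)
deployed = fake_quantize(weights)  # what would ship on-device
```

Post-training quantization, by contrast, would train without the `fake_quantize` call in the forward pass and apply it once at export, so the weights never get a chance to adapt to the rounding.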

Key quotes

Google just made Gemma 4 much easier to run on phones and laptops by releasing QAT (Quantization-Aware Training) checkpoints that shrink the smallest model from 11.4GB to 1.1GB, or 0.84GB for text-only use.