Google announces Gemini 3.5 Live Translate for instant voice-to-voice translation

Ars Technica AI · Ryan Whitwam · 2026-06-09

Google launches Gemini 3.5 Live Translate, a speech-to-speech AI model supporting over 70 languages that matches speaker intonation and pacing for near-real-time voice translation.

Appears in

Google I/O 2026: Gemini 3.5 and Agents-Everywhere Strategy

Extraction

Topics: real-time-translation, speech-to-speech, google-gemini, multimodal-ai

Claims

Gemini 3.5 Live Translate is a speech-to-speech model that automatically detects and translates over 70 languages.
The model follows a normal conversation with only a few seconds of latency while matching the speaker's intonation, pacing, and pitch.
Gemini 3.5 Live Translate is part of the version 3.5 family announced at Google I/O, with a Pro variant expected in coming weeks.
Google has pursued real-time translation for years, but its earlier offerings required specific hardware such as Pixel phones or Google earbuds.

Key quotes

Gemini 3.5 Live Translate is fast enough to keep up with a normal conversation, following just a few seconds behind the speaker while also matching intonation, pacing, and pitch.

The demos, which are all being recorded under controlled conditions, do sound impressive. You won't have to wait long to verify the model's abilities for yourself, though.