Apple working to cram massive Gemini model into iPhone to power new Siri

Ars Technica AI · Ryan Whitwam · 2026-05-28

Apple is reportedly integrating Google's Gemini into a revamped Siri that will rely heavily on Google and Nvidia cloud infrastructure rather than running primarily on-device, marking an apparent retreat from Apple's stated privacy-first AI strategy.


Appears in

Google I/O 2026: Gemini 3.5 and Agents-Everywhere Strategy

Extraction

Topics: apple-siri, on-device-ai, gemini-ai, apple-google-partnership, ai-privacy

Claims

Apple's Gemini-infused Siri will run both on-device and in the cloud, rather than primarily on-device as Apple had indicated.
Apple has delayed its AI-enhanced Siri multiple times since first announcing it in 2024.
The cloud backend for the new Siri will involve both Google and Nvidia infrastructure.
Smartphones currently lack sufficient RAM to keep large AI models in memory, limiting on-device AI capabilities.
Apple's Neural Engine is optimized for contextual, efficient AI tasks rather than running large generative models.

Key quotes

Apple's Gemini-infused Siri will run both on-device and in the cloud, an apparent reversal of its privacy-focused preference for local AI.

Even if phones had faster AI processing, they lack the RAM to keep enormous models in memory.

Apple fans may not like the outcome, though.