Apple working to cram massive Gemini model into iPhone to power new Siri
Ars Technica AI · Ryan Whitwam · 2026-05-28
Apple is reportedly integrating Google's Gemini into a revamped Siri that will rely heavily on Google and Nvidia cloud infrastructure rather than on-device processing, marking an apparent retreat from Apple's stated privacy-first AI strategy.
Extraction
Topics: apple-siri, on-device-ai, gemini-ai, apple-google-partnership, ai-privacy
Claims
- Apple's Gemini-infused Siri will run both on-device and in the cloud, rather than primarily on the device as Apple had previously indicated.
- Apple has delayed its AI-enhanced Siri multiple times since first announcing it in 2024.
- The cloud backend for the new Siri will involve both Google and Nvidia infrastructure.
- Smartphones currently lack sufficient RAM to keep large AI models in memory, limiting on-device AI capabilities.
- Apple's Neural Engine is optimized for contextual, efficient AI tasks rather than running large generative models.
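
The RAM claim above can be made concrete with a rough back-of-the-envelope estimate. The sketch below assumes the weights footprint is simply parameter count times bits per weight, and ignores activations, the KV cache, and runtime overhead. The function name `weights_footprint_gib` and the parameter counts are hypothetical, chosen only for illustration; the article does not state the size of the Gemini model involved.

```python
def weights_footprint_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold only the model weights, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical model sizes (not from the article), at 16-bit and 4-bit precision.
    for params in (3e9, 30e9, 300e9):
        for bits in (16, 4):
            gib = weights_footprint_gib(params, bits)
            print(f"{params / 1e9:.0f}B params @ {bits}-bit: {gib:.1f} GiB")
```

Under these assumptions the footprint scales linearly with parameter count, so quantizing from 16-bit to 4-bit cuts it by a factor of four but does not change the order of magnitude. Phone RAM is typically on the order of ten gigabytes and is shared with the operating system and other apps, which is consistent with the claim that very large models cannot stay resident in memory.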
Key quotes
Apple's Gemini-infused Siri will run both on-device and in the cloud, an apparent reversal of its privacy-focused preference for local AI.
Even if phones had faster AI processing, they lack the RAM to keep enormous models in memory.
Apple fans may not like the outcome, though.