Gemini 3.5 Flash: Release, Pricing, and Tooling · history
Version 4
2026-05-23 03:33 UTC · 86 items
What
Google launched Gemini 3.5 Flash at Google I/O 2026 on May 19, priced at $1.50/million input tokens and $9/million output tokens — 3x the cost of Gemini 3 Flash Preview [9][10]. The model replaced Gemini 3 as the default on Google's web interface, now powers Google Search AI Mode [6], and carries the tagline 'frontier intelligence with action' [1]. Google has since shifted to an enterprise cost-savings counter-narrative, with VentureBeat reporting Google's claim that the model can slash enterprise AI costs by more than $1 billion a year versus competing providers [16]. Independent benchmarks note the model scores within two points of Anthropic's flagship at a third of that flagship's price [15] — a cross-vendor frame that contrasts sharply with the developer community's experience of a 3x internal price hike.
Why it matters
Google's $1 billion enterprise savings claim introduces a new comparison baseline — Gemini 3.5 Flash positioned as cheap relative to GPT-5 or Claude flagship, not relative to prior Gemini Flash models. If enterprise buyers accept that frame, Google may defuse the backlash without conceding on price. The standoff between those two factually accurate but opposite-pointing comparisons will determine whether this model is remembered as an affordable frontier option or an expensive Flash replacement.
Open questions
Does Google's $1B+ enterprise savings claim hold up independently — and what is the comparison baseline: GPT-5, Claude Sonnet/Opus, or a broader average? [16]
Do the capability improvements over Gemini 3 Flash justify a 3x price increase, or are API customers now paying Pro prices for historically Flash-tier workloads? [14][13][15]
A developer forum thread documented sudden cost spikes on gemini-3-flash-preview despite decreased usage — suggesting pricing or billing behavior that developers do not yet fully understand; will Google clarify? [12]
When will Gemini 3.5 Pro arrive? It was rumored ahead of I/O and remains unannounced [25][26].
Narrative
Google launched Gemini 3.5 Flash at Google I/O 2026 on May 19, marketing it under the tagline 'frontier intelligence with action' [1][2]. The model was spotted in the Google Cloud Console hours before the official announcement [3][4], alongside a companion model called Gemini Omni Flash [5]. Upon launch, Google confirmed that Gemini 3.5 Flash had replaced Gemini 3 as the default model on its web interface, that it was powering AI Mode in Google Search [6], and that it featured in updated Google AI subscription offerings announced at I/O [7]. The Gemini API changelog [8] and Google's official model blog [1] formalized its release documentation.
The pricing landed at $1.50 per million input tokens and $9 per million output tokens — 3x the cost of Gemini 3 Flash Preview and 6x Gemini 3.1 Flash-Lite [9][10]. One analysis noted the output token price is nearly 30x more expensive than the previous cheapest Flash-tier option [11]. An independent benchmark run illustrated real-world cost impact: the Artificial Analysis benchmark cost $1,551.60 on Gemini 3.5 Flash (high), compared to $892.28 for Gemini 3.1 Pro Preview [9]. These figures nearly erase the price gap between Google's mid-tier and pro offerings, undermining the cost/capability segmentation that the 'Flash' label had previously signaled to developers. Developer frustration has expanded beyond sticker shock: a Google AI developer forum thread documented unexpected cost spikes on gemini-3-flash-preview despite decreased usage, suggesting billing behavior that developers do not yet fully understand [12].
On the capability side, early signals are favorable. Artificial Analysis rated Gemini 3.5 Flash as the 'clear leader on the Intelligence vs Speed Pareto frontier' with large benchmark gains [13]. An early-access user described first impressions as feeling 'on par with Gemini 3.1 Pro Preview' [14]. A headline from RD World characterized the model as scoring within two points of Anthropic's flagship at a third of that flagship's price [15] — a cross-vendor framing that presents the model as cheap relative to Claude rather than expensive relative to prior Flash offerings. Google has pressed this alternative frame aggressively: VentureBeat reports Google's claim that Gemini 3.5 Flash can slash enterprise AI costs by more than $1 billion a year compared to competing providers [16], repositioning the pricing debate entirely away from internal comparisons.
The developer community's pricing reaction was swift and unusually intense. Posts described the pricing as 'painful' [17], announced 'the good times are over' [18], and declared outright that 'Gemini 3.5 Flash is EXPENSIVE' [19]. Reddit threads continued characterizing the price hike as a significant negative surprise [20]. At least one voice remained broadly skeptical of Google's standing despite the launch, posting that 'Google is still behind' [21]. One observer stated they had 'never seen such negative feedback in a software release before' [22]. Analyst Simon Willison contextualized the increases as part of a broader industry repricing trend — Google, OpenAI, and Anthropic all appear to be simultaneously testing API customer price tolerance, suggesting the aggressive discounting of early LLM competition may be giving way to margin recapture [9]. On the tooling side, the llm CLI plugin for Gemini added gemini-3.5-flash support in version 0.32 [23], and a concurrent alpha introduced streaming reasoning token support [24].
Timeline
- 2026-05-17: Gemini 3.2 Flash-Lite-Live spotted in Google Cloud Console, signaling imminent I/O launches [31]
- 2026-05-19: Gemini 3.5 Flash and Gemini Omni Flash spotted in Google Cloud Console hours before official Google I/O announcement [32][3][4][5]
- 2026-05-19: Gemini 3.5 Flash officially launched at Google I/O 2026; priced at $1.50/M input, $9/M output — 3x cost of Gemini 3 Flash Preview; deployed as default on web interface and Google Search AI Mode; featured in updated Google AI subscription tiers [9][10][2][6][33][1][7][8][34]
- 2026-05-19: llm-gemini 0.32 released, adding the gemini-3.5-flash model identifier to the llm CLI plugin [23]
- 2026-05-19: llm-gemini 0.32a0 alpha released, adding streaming reasoning token support for Gemini models [24]
- 2026-05-19: Artificial Analysis benchmarks rate Gemini 3.5 Flash as 'clear leader on the Intelligence vs Speed Pareto frontier'; early-access user impressions put it on par with Gemini 3.1 Pro Preview [13][14]
- 2026-05-19: Developer community pricing backlash erupts across social media; posts describe pricing as 'painful', 'expensive', and mark the end of cheap Flash-tier access; migration and alternatives comparisons begin circulating [28][17][18][29][11][19][30][20]
- 2026-05-21: Community observer characterizes pricing reaction as worst negative feedback ever seen for a software release [22]
- 2026-05-22: Google's enterprise counter-narrative emerges: VentureBeat reports Google's claim that Gemini 3.5 Flash can cut enterprise AI costs by more than $1 billion a year vs. competing providers; benchmark coverage frames model as scoring within two points of Anthropic's flagship at a third of the price [16][15]
Perspectives
Simon Willison
Analytically skeptical of the price increases; documents the cost jump with benchmark data and frames it as part of an industry-wide repricing trend affecting all three major labs. Notes the irony of Google deploying a pricier model in free consumer products while charging API customers near-Pro rates.
Evolution: Consistent across all posts.
Artificial Analysis
Positive on capability: rates Gemini 3.5 Flash as the clear leader on the intelligence-vs-speed Pareto frontier with large benchmark gains, though their own cost data ($1,551.60 per benchmark run) illustrates the price burden.
Evolution: Consistent; provides the most structured capability endorsement of any observer.
Early-access user (Prompt/@engineerrprompt)
Positive on capability: first impressions describe Gemini 3.5 Flash as feeling on par with Gemini 3.1 Pro Preview, implicitly validating the capability-tier collapse.
Evolution: Consistent.
Positions Gemini 3.5 Flash as a flagship efficiency model with frontier-level intelligence, deploying it as the default in consumer products, Google Search AI Mode, and updated subscription tiers. Has since pivoted to an enterprise cost-savings counter-narrative, claiming the model can cut enterprise AI costs by more than $1 billion a year versus competing providers — reframing the pricing debate from internal comparison (vs. prior Flash) to cross-vendor comparison (vs. GPT-5/Claude).
Evolution: Expanded: in addition to promotional launch materials, Google is now actively counter-framing the pricing backlash with an enterprise savings narrative not present at launch.
Developer community (broad)
Strongly negative on pricing; characterize the 3x cost increase as a betrayal of the Flash-tier value proposition, with phrases like 'painful', 'the good times are over', and 'EXPENSIVE' dominating reaction posts. Migration research is underway, and practical billing anomalies on prior Flash models are compounding the frustration.
Evolution: Consistent with prior; Reddit discussion [20] and billing anomaly reports [12] confirm the reaction has not dissipated and is expanding to practical billing complaints.
Cross-vendor benchmark observers
Frame Gemini 3.5 Flash favorably relative to expensive flagship models from Anthropic and OpenAI, noting it scores within two points of Anthropic's flagship at a third of that flagship's price — a reframing that implicitly supports Google's enterprise cost narrative by shifting the reference point away from prior Gemini Flash models.
Evolution: New voice this pass; introduces a comparison baseline that diverges from the developer community's internal price comparison.
Tensions
- Artificial Analysis and early-access users rate Gemini 3.5 Flash as a genuine capability leader matching Gemini 3.1 Pro [13][14], while the developer community broadly rejects the pricing as unjustifiable for a Flash-tier model [17][18][20] — a standoff between benchmark performance and cost tolerance. [13][14][17][18][20]
- Google and cross-vendor benchmarks frame Gemini 3.5 Flash as cheap relative to GPT-5 and Claude flagship models [16][15], while API developers experience it as 3x more expensive than the prior Flash model they were paying for [9][10] — two pricing comparisons that are both factually accurate but point to opposite conclusions about affordability. [16][15][9][10]
- Google positions Gemini 3.5 Flash as a Flash-tier efficiency model, deploys it free in consumer products and updated subscription tiers [6][7], and promotes it as frontier intelligence [1] — while its API pricing nearly matches Gemini 3.1 Pro, creating an unresolved tension between the 'Flash' branding promise and the cost burden placed on API customers. [9][10][6][1][7]
Sources
- [1] Gemini 3.5: frontier intelligence with action - Google Blog — reactive:google-io-gemini-launch
- [2] Google shipped Gemini 3.5 Flash at I/O 2026. Marketing says "frontier intelligence with action." — reactive:gemini-35-flash-release (2026-05-20)
- [3] Gemini 3.5 Flash is now visible within the Google Cloud Console, preceding its official announcement at Google I/O. It a... — reactive:google-io-2026-launch-blitz (2026-05-19)
- [4] 🧵 Gemini 3.5 Flash Appears in Google Cloud Console (May 19, 2026) — reactive:gemini-35-flash-release (2026-05-19)
- [5] Google DeepMind have released Gemini Omni Flash and Gemini 3.5 Flash — reactive:google-io-2026-launch-blitz (2026-05-19)
- [6] Gemini 3.5 Flash is officially live, replacing Gemini 3 as the default option on the web interface and powering Google S... — reactive:gemini-35-flash-release (2026-05-20)
- [7] Everything new in our Google AI subscriptions, fresh from I/O 2026 — reactive:google-io-gemini-launch
- [8] Release notes | Gemini API - Google AI for Developers — reactive:google-io-gemini-launch
- [9] Gemini 3.5 Flash: more expensive, but Google plan to use it for everything — Simon Willison (2026-05-19)
- [10] 3.5 Flash pricing: $1.50/million input tokens, $9/million output. That's 3x Gemini 3 Flash Preview and 6x the 3.1 Flash-... — reactive:gemini-35-flash-release (2026-05-20)
- [11] Gemini 3.5 Flash now costs $9 per million output tokens , 3x the price of Gemini 3 Flash and nearly 30x more than Gemini... — reactive:gemini-35-flash-release (2026-05-19)
- [12] Sudden Cost Spike with gemini-3-flash-preview Despite Decreased ... — reactive:gemini-35-flash-release
- [13] Google’s new Gemini 3.5 Flash is the clear leader on the Intelligence vs Speed Pareto frontier and makes large gains on ... — reactive:google-io-2026-launch-blitz (2026-05-19)
- [14] Gemini 3.5 Flash, first impressions. I had early access to the model, and it feels on par with Gemini 3.1 Pro Preview. T... — reactive:google-io-2026-launch-blitz (2026-05-19)
- [15] Google's Gemini 3.5 Flash scores within two points of Anthropic's ... — reactive:gemini-35-flash-release
- [16] Google says Gemini 3.5 Flash can slash enterprise AI costs by more ... — reactive:gemini-35-flash-release
- [17] gemini 3.5 flash pricing is painful. — reactive:gemini-35-flash-release (2026-05-19)
- [18] Gemini 3.5 Flash pricing has changed - the good times are over! It’s now 3x more expensive than Gemini 3 Flash. https://... — reactive:gemini-35-flash-release (2026-05-19)
- [19] Gemini 3.5 Flash is EXPENSIVE. — reactive:gemini-35-flash-release (2026-05-19)
- [20] Google just dropped Gemini 3.5 Flash and the price hike is pretty ... — reactive:gemini-35-flash-release
- [21] Google is still behind. Previously I thought Gemini 3.5 Flash was ... — reactive:gemini-35-flash-release
- [22] @andyzhang @antigravity I never saw such a negative feedback in a software release before. Replacing Gemini 3 Flash quot... — reactive:google-io-2026-launch-blitz (2026-05-21)
- [23] llm-gemini 0.32 — Simon Willison (2026-05-19)
- [24] llm-gemini 0.32a0 — Simon Willison (2026-05-19)
- [25] Gemini 3.5 Pro rumor just got spicy 🔥 — reactive:gemini-35-flash-release (2026-05-19)
- [26] Gemini 3.5 Pro watch: Google's official Gemini API model page lists Gemini 3.1 Pro Preview and Gemini 3 Flash, but no ge... — reactive:gemini-35-flash-release (2026-05-19)
- [27] Gemini 3.5 Flash: more expensive, but Google plan to use it for everything — reactive:gemini-35-flash-release
- [28] Gemini 3.5 Flash Pricing: 😢 — reactive:gemini-35-flash-release (2026-05-19)
- [29] GEMINI 3.5 FLASH MIGHT NOT BE CHEAP AT ALL — reactive:gemini-35-flash-release (2026-05-19)
- [30] Gemini 3.5 Flash vs Gemini 3 Flash Preview: Pricing & Migration ... — reactive:gemini-35-flash-release
- [31] UPDATE: Gemini 3.2 Flash-Lite-Live just spotted in the Google Cloud Console. — reactive:google-io-2026-launch-blitz (2026-05-17)
- [32] Gemini 3.5 Flash (In Console) and Gemini Omni are nearing release, with only two hours remaining until Google I/O. https... — reactive:google-io-2026-launch-blitz (2026-05-19)
- [33] Google’s official pricing page has been updated to include Gemini 3.5 Flash. — reactive:gemini-35-flash-release (2026-05-19)
- [34] Google launches Gemini 3.5 Flash. How to try it for free. | Mashable — reactive:google-io-gemini-launch