Gemini 3.5 Flash: Release, Pricing, and Tooling · history
Version 6
2026-05-24 08:44 UTC · 167 items
What
Google launched Gemini 3.5 Flash at Google I/O 2026 on May 19 at $1.50/M input and $9/M output tokens — 3x the cost of Gemini 3 Flash Preview — while simultaneously making consumer AI subscription tiers cheaper [13][14][12]. The sticker-price debate has split into two valid but opposite framings: cheap relative to GPT-5 and Claude flagships [19], or expensive relative to prior Flash models [21]. The sharper story, however, is the API key billing-vulnerability track: a security researcher documented in April 2026 that Google's policy change converted previously-safe public API keys into billing liabilities [29], at least one developer was charged $10,138 in March 2026 with Google support closing the case twice saying 'no fraud found' [30], and a Cybernews piece frames the pattern as 'legacy API keys push Google Cloud developers to bankruptcy' [34]. Gemini 3.5 Pro is widely expected to launch in June based on community sources, with no official Google announcement [42][43].
Why it matters
The billing/security dimension has crossed from developer frustration into documented financial harm without recourse: Google support's 'no fraud found' refusal [30] signals that Google may treat policy-driven billing surprises as normal business behavior rather than errors. The simultaneous consumer price cut and API price hike suggests Google is deliberately cross-subsidizing consumer access from API developer revenue — a structural choice that widens the gap between Google's consumer and developer relationships and raises the question of who ultimately bears the cost of Google's AI buildout.
Open questions
Will Google provide refunds to developers charged via the API key vulnerability? Google support reportedly closed at least one $10,138 case saying 'no fraud found' [30] — what is Google's formal liability position on charges that stem from its own policy change?
The Equixly analysis [29] argues Google's policy change — not developer negligence — created the billing exposure; does Google accept any responsibility for developers who followed previously acceptable security practices?
How does the consumer subscription pricing cut at I/O 2026 [12] relate to the simultaneous 3x API price increase — is Google cross-subsidizing free consumer access from API developer revenue, as at least one analysis argues [28]?
When does Gemini 3.5 Pro launch? Community sources expect June 2026 [42][43], but Google has made no formal announcement.
Narrative
Google launched Gemini 3.5 Flash at Google I/O 2026 on May 19 under the tagline 'frontier intelligence with action' [1][2]. The model had been spotted in the Google Cloud Console hours before the official announcement [3][4], alongside a companion model called Gemini Omni Flash [5]. At launch, Gemini 3.5 Flash replaced Gemini 3 as the default on Google's web interface and began powering AI Mode in Google Search [6]. Official documentation — including 'What's New' API guides [7][8], a Google Cloud Enterprise Agent Platform listing [9], and Google DeepMind's model card [10] — formalized its capabilities. In a notable asymmetry, Mashable reported that Google AI consumer subscription tiers became cheaper at I/O, not more expensive [11][12], even as API prices rose sharply.
API pricing landed at $1.50 per million input tokens and $9 per million output tokens — 3x the cost of Gemini 3 Flash Preview and up to 6x the cheapest prior Flash option [13][14]. Artificial Analysis rated Gemini 3.5 Flash as the 'clear leader on the Intelligence vs Speed Pareto frontier' [15][16], and their benchmark run cost $1,551.60 versus $892.28 for Gemini 3.1 Pro Preview [13], nearly collapsing the historical price gap between Flash and Pro tiers. A Towards AI piece testing the model on 18 agent tasks found it outperformed GPT-5.5 at 4x the speed [17], and early-access users described impressions as on par with Gemini 3.1 Pro Preview [18]. Google and aligned voices pressed a cross-vendor reframing — Gemini 3.5 Flash is cheap relative to GPT-5 and Claude flagships, with VentureBeat reporting Google's claim that it can cut enterprise AI costs by more than $1 billion a year versus competing providers [19][20]. The Decoder framed the price increase as Google following Anthropic and OpenAI in 'making newer AI models significantly pricier' [21], lending industry-context support to analyst Tomasz Tunguz's 'The Unsustainable Subsidy' thesis that the era of below-cost AI access is ending across all major labs [22][23]. Developer communities characterized the increase as a betrayal of the Flash-tier value proposition [24][25][26], with YouTube commentary explicitly labeling it 'the hidden cost' [27]. Remio.ai argued that by giving consumers free Flash access, Google is not selling the model but building ecosystem engagement — framing the consumer-API pricing gap as strategic rather than accidental [28].
The billing-anomaly story has escalated into a documented security and liability crisis that predates Gemini 3.5 Flash's launch. A security researcher at Equixly published analysis on April 2, 2026 — weeks before Google I/O — arguing that Google's policy change had converted previously-safe public API keys into billing liabilities, effectively turning a formerly acceptable security posture into catastrophic financial exposure [29]. A Reddit post in r/googlecloud documents a developer charged $10,138 in March 2026 via this vulnerability, with Google support closing the case twice saying 'no fraud found' and offering no refund [30]. A separate Hacker News thread documents a €54,000 billing spike in 13 hours from an unrestricted Firebase browser key [31][32][33]. Cybernews framed the pattern as 'legacy API keys push Google Cloud developers to bankruptcy' [34], and a Reddit thread attributed the charges directly to 'Google's policy change' rather than developer negligence [35]. A separate forum thread documents urgent API cost increases dating to March 16-17, 2026 [36] and a Medium piece described price hikes that 'never showed up on the pricing page' yet caused bills to rise 27% [37]. Advisory pieces from third-party consultancies now warn developers to prepare for upcoming Gemini API billing tier changes [38][39]. Taken together, these incidents — spanning March through May 2026 — suggest the billing problem is structural and predates the new model rollout.
On tooling, the llm CLI plugin added gemini-3.5-flash support in version 0.32 [40] with a concurrent alpha adding streaming reasoning token support [41]. Gemini 3.5 Pro remains unannounced; both a wavespeed.ai blog post and a Reddit thread expect it to arrive in June 2026 [42][43]. A YouTube video framing what the Flash release signals about the forthcoming Pro model attracted significant attention [44], and a review video simply titled 'Gemini 3.5 Flash is just... fine' [45] captured the muted enthusiasm that coexists with pricing frustration in the developer community.
Timeline
- 2026-03: Developer charged $10,138 via Gemini API key vulnerability; Google support closes case twice saying 'no fraud found' and refuses refund [30]
- 2026-03-16: Developer forum post documents urgent API cost increases beginning March 16-17, 2026 — weeks before the Gemini 3.5 Flash launch [36]
- 2026-04-02: Equixly publishes security analysis documenting how Google's policy change converted previously-safe public API keys into billing liabilities [29]
- 2026-05-17: Gemini 3.2 Flash-Lite-Live spotted in Google Cloud Console, signaling imminent I/O launches [64]
- 2026-05-19: Gemini 3.5 Flash and Gemini Omni Flash spotted in Google Cloud Console hours before official Google I/O announcement [65][3][4][5]
- 2026-05-19: Gemini 3.5 Flash officially launched at Google I/O 2026; priced at $1.50/M input, $9/M output — 3x cost of Gemini 3 Flash Preview; deployed as default on web interface and Google Search AI Mode; Google AI consumer subscription tiers made cheaper at I/O; official model documentation and API changelog published [13][14][2][6][66][1][10][11][49][67][7][9][8][12]
- 2026-05-19: llm-gemini 0.32 released, adding the gemini-3.5-flash model identifier to the llm CLI plugin; concurrent alpha adds streaming reasoning token support [40][41]
- 2026-05-19: Artificial Analysis benchmarks rate Gemini 3.5 Flash as 'clear leader on the Intelligence vs Speed Pareto frontier'; early-access user impressions put it on par with Gemini 3.1 Pro Preview; Towards AI 18-agent-task test finds it outperforms GPT-5.5 at 4x speed [15][18][16][17]
- 2026-05-19: Developer community pricing backlash erupts across social media and forums; posts describe pricing as 'painful' and mark the end of cheap Flash-tier access; migration and alternatives research begins [51][24][25][52][53][54][55][56][26]
- 2026-05-21: Community observer characterizes pricing reaction as the worst negative feedback ever seen for a software release; tweet notes 'Gemini 3.5 Flash beats last year's Pro. The pricing conversation tells the rest.' [50][68]
- 2026-05-22: Google's enterprise counter-narrative emerges: VentureBeat reports Google's claim that Gemini 3.5 Flash can cut enterprise AI costs by more than $1 billion a year vs. competing providers; AI Business echoes enterprise efficiency angle; benchmark coverage frames model as scoring within two points of Anthropic's flagship at a third of the price [19][60][20]
- 2026-05-23: Scale Advantage posts that Gemini 3.5 Pro is coming next month; developer calls publicly for lower pricing; Tomasz Tunguz publishes 'The Unsustainable Subsidy,' framing industry repricing as the end of an AI land-grab subsidy era [69][59][22][23]
- 2026-05-24: Cybernews frames pattern of API key billing exploits as 'legacy API keys push Google Cloud developers to bankruptcy'; Reddit documents charges attributed directly to Google's policy change; multiple sources confirm €54k Firebase key incident and Google support's refusal to acknowledge fraud [34][35][31][32][33][30]
Perspectives
Simon Willison
Analytically skeptical of the price increases; documents the cost jump with benchmark data and frames it as part of an industry-wide repricing trend affecting all three major labs. Notes the irony of Google deploying a pricier model in free consumer products while charging API customers near-Pro rates.
Evolution: Consistent across all posts.
Tomasz Tunguz
Frames current AI API pricing as the end of an unsustainable subsidized land-grab era; the repricing across Google, OpenAI, and Anthropic reflects labs moving toward margin recapture after years of below-cost access.
Evolution: Consistent; multiple LinkedIn and Twitter amplifications reinforce the same 'Unsustainable Subsidy' thesis without adding new substance.
Artificial Analysis
Positive on capability: rates Gemini 3.5 Flash as the clear leader on the intelligence-vs-speed Pareto frontier with large benchmark gains. Their own cost data ($1,551.60 per benchmark run vs. $892.28 for Gemini 3.1 Pro Preview) illustrates the price burden even as they endorse the performance.
Evolution: Consistent.
Equixly (security research)
Documents the technical mechanism: Google's policy change converted previously-acceptable public API key deployment into a billing liability, framing developer exposure as a consequence of Google's undisclosed policy shift rather than developer negligence.
Evolution: New voice this pass; provides the most technically grounded account of how the billing vulnerability arose and why responsibility may rest with Google rather than developers.
The Decoder
Frames Gemini 3.5 Flash's price increase as part of a deliberate industry trend — Google follows Anthropic and OpenAI in making newer models significantly pricier — rather than an isolated Google decision.
Evolution: New voice this pass; reinforces the Tunguz/Willison repricing thesis from a news-analysis perspective.
Positions Gemini 3.5 Flash as a flagship efficiency model with frontier-level intelligence; deployed it as the default in consumer products and made consumer subscription tiers cheaper at I/O. Has pressed an enterprise cost-savings counter-narrative, claiming the model can cut enterprise AI costs by more than $1 billion a year versus competing providers. Has not formally responded to the API key billing vulnerability story; Google support has reportedly closed billing dispute cases saying 'no fraud found,' an implicit position that developers bear responsibility for policy-driven billing exposure.
Evolution: Extended: the consumer subscription price cut adds a cross-subsidy dimension — Google may be funding cheaper consumer access from API revenue. The 'no fraud found' support responses constitute an implicit liability stance that has not been formally articulated.
Developer community (broad)
Strongly negative on pricing; characterize the 3x cost increase as a betrayal of the Flash-tier value proposition. Billing anomaly complaints have escalated to documented financial harm: a $10,138 charge with no refund, a €54k Firebase exposure, and charges dating to March 2026 from a Google policy change developers were not warned about. Google support's refusal to grant refunds has deepened distrust beyond sticker-price frustration.
Evolution: Escalated: the refund denial evidence transforms the billing narrative from opacity to active, uncompensated financial harm — a potential platform trust crisis distinct from the pricing debate.
Cross-vendor benchmark observers
Frame Gemini 3.5 Flash favorably relative to expensive flagship models from Anthropic and OpenAI — scoring within two points of Anthropic's flagship at a third of the price — implicitly supporting Google's enterprise cost narrative by shifting the reference point away from prior Gemini Flash models.
Evolution: Consistent; production and coding-agent guides add practical agentic-workload framing.
Tensions
- Artificial Analysis and early-access users rate Gemini 3.5 Flash as a genuine capability leader matching or exceeding Gemini 3.1 Pro [15][18][16], while the developer community broadly rejects the pricing as unjustifiable for a Flash-tier model [24][25][26] — a standoff between benchmark performance and cost tolerance. [15][18][16][24][25][26]
- Google and cross-vendor benchmarks frame Gemini 3.5 Flash as cheap relative to GPT-5 and Claude flagship models [19][60][20], while API developers experience it as 3x more expensive than the prior Flash model they were paying for [13][14] — two pricing comparisons that are both factually accurate but point to opposite conclusions about affordability. [19][60][20][13][14]
- Google made consumer AI subscription tiers cheaper at I/O 2026 [12] while simultaneously raising API prices 3x [13][14] — a divergence that suggests Google may be cross-subsidizing consumer access from API developer revenue [28], framing the Flash-tier price hike as a deliberate structural choice rather than cost-reflective pricing. [12][13][14][28]
- Equixly's security analysis argues that Google's own policy change created the billing liability for developers using public API keys [29], while Google support closed a developer's $10,138 billing dispute saying 'no fraud found' [30] — an implicit disagreement about whether the billing exposure is Google's responsibility or the developer's. [29][30][34][35]
- Google's stable public pricing pages imply a predictable billing environment, but developer forum threads document cost increases dating to March 2026 [36], a €54k Firebase key exposure [31], and hidden price hikes not reflected on the pricing page [37] — a tension between Google's official pricing narrative and the billing opacity developers are actually experiencing. [36][31][37][57][58]
Sources
- [1] Gemini 3.5: frontier intelligence with action - Google Blog — reactive:google-io-gemini-launch
- [2] Google shipped Gemini 3.5 Flash at I/O 2026. Marketing says "frontier intelligence with action." — reactive:gemini-35-flash-release (2026-05-20)
- [3] Gemini 3.5 Flash is now visible within the Google Cloud Console, preceding its official announcement at Google I/O. It a... — reactive:google-io-2026-launch-blitz (2026-05-19)
- [4] 🧵 Gemini 3.5 Flash Appears in Google Cloud Console (May 19, 2026) — reactive:gemini-35-flash-release (2026-05-19)
- [5] Google DeepMind have released Gemini Omni Flash and Gemini 3.5 Flash — reactive:google-io-2026-launch-blitz (2026-05-19)
- [6] Gemini 3.5 Flash is officially live, replacing Gemini 3 as the default option on the web interface and powering Google S... — reactive:gemini-35-flash-release (2026-05-20)
- [7] What's new in Gemini 3.5 Flash - Interactions API — reactive:google-io-2026-gemini
- [8] What's new in Gemini 3.5 Flash - generateContent API — reactive:google-io-2026-launch-blitz
- [9] Gemini 3.5 Flash | Gemini Enterprise Agent Platform — reactive:google-io-2026-gemini
- [10] Gemini 3.5 Flash - Model Card - Google DeepMind — reactive:google-io-gemini-launch
- [11] Everything new in our Google AI subscriptions, fresh from I/O 2026 — reactive:google-io-gemini-launch
- [12] Google I/O 2026: AI subscription tiers are now cheaper | Mashable — reactive:gemini-35-flash-release
- [13] Gemini 3.5 Flash: more expensive, but Google plan to use it for everything — Simon Willison (2026-05-19)
- [14] 3.5 Flash pricing: $1.50/million input tokens, $9/million output. That's 3x Gemini 3 Flash Preview and 6x the 3.1 Flash-... — reactive:gemini-35-flash-release (2026-05-20)
- [15] Google’s new Gemini 3.5 Flash is the clear leader on the Intelligence vs Speed Pareto frontier and makes large gains on ... — reactive:google-io-2026-launch-blitz (2026-05-19)
- [16] Gemini 3.5 Flash: The new leader in intelligence versus speed — reactive:google-io-gemini-launch
- [17] I Tested Gemini 3.5 Flash on 18 Agent Tasks — Google's 6× Pricier ... — reactive:google-io-2026-gemini
- [18] Gemini 3.5 Flash, first impressions. I had early access to the model, and it feels on par with Gemini 3.1 Pro Preview. T... — reactive:google-io-2026-launch-blitz (2026-05-19)
- [19] Google says Gemini 3.5 Flash can slash enterprise AI costs by more ... — reactive:gemini-35-flash-release
- [20] Google Aims at Enterprise Cost Efficiency With Gemini 3.5 Flash — reactive:gemini-35-flash-release
- [21] Google's Gemini 3.5 Flash follows Anthropic and OpenAI in making ... — reactive:gemini-35-flash-release
- [22] The Unsustainable Subsidy | Tomasz Tunguz — reactive:codex-practical-dev-tool
- [23] The subsidy era is over. Three years of AI pricing data tells the story. — reactive:gemini-35-flash-release
- [24] gemini 3.5 flash pricing is painful. — reactive:gemini-35-flash-release (2026-05-19)
- [25] Gemini 3.5 Flash pricing has changed - the good times are over! It’s now 3x more expensive than Gemini 3 Flash. https://... — reactive:gemini-35-flash-release (2026-05-19)
- [26] Gemini 3.5 flash costs 3 times more than the previous version and ... — reactive:gemini-35-flash-release
- [27] Gemini 3.5: The Hidden Cost of the Flash Model (It's Not the Budget Model) — reactive:gemini-35-flash-release
- [28] Google Made Its Fastest AI Model Free. Here's What It's Really Selling. — reactive:gemini-35-flash-release
- [29] How Gemini turned public Google API keys into secrets | Equixly — reactive:gemini-35-flash-release
- [30] Charged $10,138 in March 2026 due to Google's documented Gemini API key vulnerability — support closed my case twice saying "no fraud found" : r/googlecloud — reactive:gemini-35-flash-release
- [31] €54k spike in 13h from unrestricted Firebase browser key accessing ... — reactive:gemini-35-flash-release
- [32] Unexpected €54k billing spike in 13 hours: Firebase browser key without API restrictions used for Gemini requests - Gemini API - Google AI Developers Forum — reactive:gemini-35-flash-release
- [33] Developer hit with €54,000 Firebase bill due to exposed API key — reactive:gemini-35-flash-release
- [34] Legacy API keys push Google Cloud developers to bankruptcy — reactive:gemini-35-flash-release
- [35] Huge charges via GeminiAPI exploited due to googles policy change — reactive:gemini-35-flash-release
- [36] URGENT: The cost of API access has increased since March 16-17, 2026 - Gemini API - Google AI Developers Forum — reactive:gemini-35-flash-release
- [37] The AI price hike that never showed up on the pricing page ... — reactive:gemini-35-flash-release
- [38] Google Gemini API Billing Tier Changes 2026 - LaoZhang AI Blog — reactive:gemini-35-flash-release
- [39] Don’t Get Caught Off Guard: Prepare Now for Upcoming Gemini API Billing Changes - The Wursta Corporation — reactive:gemini-35-flash-release
- [40] llm-gemini 0.32 — Simon Willison (2026-05-19)
- [41] llm-gemini 0.32a0 — Simon Willison (2026-05-19)
- [42] Gemini 3.5 Pro Is Coming Next Month — What the Flash Release ... — reactive:google-io-gemini-launch
- [43] Gemini 3.5 Pro coming in June : r/Bard - Reddit — reactive:google-io-2026-gemini
- [44] Google's New AI Update Just Shocked The AI Industry - Gemini 3.5 Pro ... — reactive:gemini-35-flash-release
- [45] Gemini 3.5 Flash is just... fine - YouTube — reactive:google-io-gemini-launch
- [46] Gemini 3.5 Flash: more expensive, but Google plan to use it for everything — reactive:gemini-35-flash-release
- [47] The Unsustainable Subsidy - LinkedIn — reactive:gemini-35-flash-release
- [48] The Unsustainable Subsidy | Tomasz Tunguz - LinkedIn — reactive:gemini-35-flash-release
- [49] Release notes | Gemini API - Google AI for Developers — reactive:google-io-gemini-launch
- [50] @andyzhang @antigravity I never saw such a negative feedback in a software release before. Replacing Gemini 3 Flash quot... — reactive:google-io-2026-launch-blitz (2026-05-21)
- [51] Gemini 3.5 Flash Pricing: 😢 — reactive:gemini-35-flash-release (2026-05-19)
- [52] GEMINI 3.5 FLASH MIGHT NOT BE CHEAP AT ALL — reactive:gemini-35-flash-release (2026-05-19)
- [53] Gemini 3.5 Flash now costs $9 per million output tokens , 3x the price of Gemini 3 Flash and nearly 30x more than Gemini... — reactive:gemini-35-flash-release (2026-05-19)
- [54] Gemini 3.5 Flash is EXPENSIVE. — reactive:gemini-35-flash-release (2026-05-19)
- [55] Gemini 3.5 Flash vs Gemini 3 Flash Preview: Pricing & Migration ... — reactive:gemini-35-flash-release
- [56] Google just dropped Gemini 3.5 Flash and the price hike is pretty ... — reactive:gemini-35-flash-release
- [57] Sudden Cost Spike with gemini-3-flash-preview Despite Decreased ... — reactive:gemini-35-flash-release
- [58] Sudden Cost Spike with gemini-3-flash-preview Despite Decreased Usage (April 2026) - Page 2 - Gemini API - Google AI Developers Forum — reactive:gemini-35-flash-release
- [59] @kr0der For now, I’d really like to see lower pricing for — reactive:gemini-35-flash-release (2026-05-23)
- [60] Google's Gemini 3.5 Flash scores within two points of Anthropic's ... — reactive:gemini-35-flash-release
- [61] @GeminiApp 3.5 Flash just dropped at Google I/O. It outperforms Gemini 3.1 Pro across almost all benchmarks — and runs ... — reactive:gemini-35-flash-release (2026-05-24)
- [62] Gemini 3.5 Flash for Coding Agents: Cost & Production Guide (2026) — reactive:gemini-35-flash-release
- [63] Gemini 3.5 Flash vs GPT-5.5 vs Claude Opus 4.7 Compared | Lushbinary — reactive:gemini-35-flash-release
- [64] UPDATE: Gemini 3.2 Flash-Lite-Live just spotted in the Google Cloud Console. — reactive:google-io-2026-launch-blitz (2026-05-17)
- [65] Gemini 3.5 Flash (In Console) and Gemini Omni are nearing release, with only two hours remaining until Google I/O. https... — reactive:google-io-2026-launch-blitz (2026-05-19)
- [66] Google’s official pricing page has been updated to include Gemini 3.5 Flash. — reactive:gemini-35-flash-release (2026-05-19)
- [67] Google launches Gemini 3.5 Flash. How to try it for free. | Mashable — reactive:google-io-gemini-launch
- [68] Gemini 3.5 Flash beats last year's Pro. The pricing conversation tells the rest. — reactive:gemini-35-flash-release (2026-05-21)
- [69] Gemini 3.5 pro coming out next month and might be amazing. But the only major data point since Gemini 3 release 6mo ago ... — reactive:google-io-2026-launch-blitz (2026-05-23)