Gemini 3.5 Flash: Release, Pricing, and Tooling · history

Version 5

2026-05-24 02:44 UTC · 137 items

Changes since v4

The billing-anomaly story has expanded into a distinct narrative: a €54k billing spike from an unrestricted Firebase browser key [^13217] adds a security-exposure dimension, a separate forum thread reveals API cost increases from March 16-17, 2026 [^15585] predating the Gemini 3.5 Flash launch entirely, and a Medium piece documents price hikes that never appeared on the pricing page [^15587] — together suggesting structural billing opacity rather than isolated incidents. Tomasz Tunguz's 'The Unsustainable Subsidy' [^13332] is a new analyst voice lending VC credibility to the industry-repricing thesis. On Gemini 3.5 Pro timing, a May 23 tweet claims it is coming next month [^15408]. Otherwise, most new items are SEO pricing-comparison roundups and social media amplification without new substantive fault lines.

What

Google launched Gemini 3.5 Flash at Google I/O 2026 on May 19, priced at $1.50/M input and $9/M output tokens — 3x the cost of Gemini 3 Flash Preview and up to 6x the cheapest prior Flash option [12][13]. The model replaced Gemini 3 as the web default and powers Google Search AI Mode [6][1]. Google's counter-narrative frames the price as cheap versus GPT-5 and Claude, claiming over $1 billion in annual enterprise savings versus competing providers [20][21] — a framing that conflicts with developers' lived experience of a sharp internal price hike. A parallel billing-anomaly story has expanded: unexplained cost increases dating to at least March 2026 [30], a €54k bill from a single unrestricted Firebase key [31], and reports of price hikes that never appeared on the pricing page [32] suggest Gemini API billing behavior is murkier than the sticker price debate implies.

Why it matters

The pricing debate has split into two factually valid but opposite framings — cheap relative to OpenAI and Anthropic flagships, or expensive relative to prior Flash models — and which frame enterprise buyers adopt will define Gemini 3.5 Flash's market position. The expanding billing-anomaly story raises a sharper question: whether the cost of using the Gemini API is predictable at all, making billing reliability a more fundamental concern than sticker price.

Open questions

Does Google's $1B+ annual enterprise savings claim hold up under independent scrutiny, and what is the actual comparison baseline — GPT-5, Claude flagship, or a broader market average? [20][21]
Do the capability improvements over Gemini 3 Flash justify a 3x internal price increase, or are API developers now paying Pro-tier rates for historically Flash-tier workloads? [12][13][18]
Billing anomalies on Gemini models were reported as early as March 16-17, 2026 [30] — weeks before the 3.5 Flash launch — and include a €54k spike from an unrestricted Firebase browser key [31]; is the billing problem structural, and will Google address it formally?
When does Gemini 3.5 Pro arrive? At least one observer claims it is coming next month [36], but Google has made no formal announcement.

Narrative

Google launched Gemini 3.5 Flash at Google I/O 2026 on May 19 under the tagline 'frontier intelligence with action' [1][2]. The model had been spotted in the Google Cloud Console hours before the official announcement [3][4], alongside a companion model called Gemini Omni Flash [5]. At launch, Gemini 3.5 Flash replaced Gemini 3 as the default on Google's web interface and began powering AI Mode in Google Search [6]. Official documentation — including 'What's New' API guides [7][8], a Google Cloud Enterprise Agent Platform listing [9], and Google DeepMind's model card [10] — formalized its capabilities. The model was also integrated into updated Google AI subscription tiers announced at I/O [11].

The pricing landed at $1.50 per million input tokens and $9 per million output tokens — 3x the cost of Gemini 3 Flash Preview and 6x the cheapest prior Flash-tier option [12][13]. One analysis noted the output token price is nearly 30x more expensive than the previous cheapest Flash alternative [14]. An independent Artificial Analysis benchmark run cost $1,551.60 on Gemini 3.5 Flash versus $892.28 for Gemini 3.1 Pro Preview [12], nearly erasing the historical price gap between Flash and Pro tiers. On capability, Artificial Analysis rated Gemini 3.5 Flash as the 'clear leader on the Intelligence vs Speed Pareto frontier' [15][16], and an early-access user described first impressions as feeling 'on par with Gemini 3.1 Pro Preview' [17]. A Towards AI piece testing the model on 18 agent tasks found it outperformed GPT-5.5 on agentic workloads at 4x the speed, while framing its cost as 6x higher than prior Flash-Lite baselines [18]. One observer summarized the situation tersely: 'Gemini 3.5 Flash beats last year's Pro. The pricing conversation tells the rest.' [19]

Google and aligned voices have pressed a cross-vendor reframing: Gemini 3.5 Flash is cheap relative to GPT-5 and Claude flagships, not expensive relative to prior Flash models. VentureBeat reported Google's claim that the model can slash enterprise AI costs by more than $1 billion a year versus competing providers [20], and AI Business echoed the enterprise cost-efficiency angle [21]. An independent benchmark headline claimed the model scores within two points of Anthropic's flagship at a third of that flagship's price [22]. This reframing has not defused developer frustration: Reddit threads, social media posts, and developer forums continued characterizing the 3x internal price hike as a betrayal of the Flash-tier value proposition [23][24][25], with at least one developer explicitly calling for lower pricing [26]. Analyst Simon Willison documented the cost jump with benchmark data and framed it as part of an industry-wide repricing trend. Venture capitalist Tomasz Tunguz published 'The Unsustainable Subsidy' [27], arguing that current AI pricing reflects an ending land-grab subsidy rather than sustainable economics — lending independent analyst credibility to the thesis that the aggressive discounting era is closing across all major labs.

The billing-anomaly dimension of this story has expanded significantly and now predates the Gemini 3.5 Flash launch itself. A Google AI Developers Forum thread documented unexpected cost spikes on gemini-3-flash-preview despite decreased usage [28][29]. A separate 'URGENT' forum post reveals API cost increases dating to March 16-17, 2026 [30] — nearly two months before Google I/O — suggesting billing behavior changes were underway independently of the new model rollout. A Hacker News thread documented a €54k billing spike in 13 hours from an unrestricted Firebase browser key accessing the Gemini API [31], introducing a security-exposure dimension: predictable billing may require API key access controls that Google has not made sufficiently visible to developers. A Medium piece described a price hike that 'never showed up on the pricing page' yet caused bills to rise 27% [32]. Taken together, these reports suggest that Gemini API billing behavior — its transparency, security assumptions, and change-notification practices — is a distinct and potentially more consequential problem than the sticker-price debate. On tooling, the llm CLI plugin added gemini-3.5-flash support in version 0.32 [33] with a concurrent alpha adding streaming reasoning token support [34]; the llm-gemini GitHub repository [35] serves as the canonical integration point for CLI users. Gemini 3.5 Pro remains unannounced; one social media observer on May 23 claimed it is coming next month [36], but Google has made no formal statement.

Timeline

2026-03-16: Developer forum post documents urgent API cost increases beginning March 16-17, 2026 — weeks before the Gemini 3.5 Flash launch [30]
2026-05-17: Gemini 3.2 Flash-Lite-Live spotted in Google Cloud Console, signaling imminent I/O launches [46]
2026-05-19: Gemini 3.5 Flash and Gemini Omni Flash spotted in Google Cloud Console hours before official Google I/O announcement [47][3][4][5]
2026-05-19: Gemini 3.5 Flash officially launched at Google I/O 2026; priced at $1.50/M input, $9/M output — 3x cost of Gemini 3 Flash Preview; deployed as default on web interface and Google Search AI Mode; featured in updated Google AI subscription tiers; official model documentation and API changelog published [12][13][2][6][48][1][10][11][38][49][7][9][8]
2026-05-19: llm-gemini 0.32 released, adding the gemini-3.5-flash model identifier to the llm CLI plugin; concurrent alpha adds streaming reasoning token support [33][34]
2026-05-19: Artificial Analysis benchmarks rate Gemini 3.5 Flash as 'clear leader on the Intelligence vs Speed Pareto frontier'; early-access user impressions put it on par with Gemini 3.1 Pro Preview; Towards AI 18-agent-task test finds it outperforms GPT-5.5 at 4x speed [15][17][16][18]
2026-05-19: Developer community pricing backlash erupts across social media and forums; posts describe pricing as 'painful', 'expensive', and mark the end of cheap Flash-tier access; migration and alternatives research begins [40][23][24][41][14][42][43][44][25]
2026-05-21: Community observer characterizes the pricing reaction as the worst negative feedback ever seen for a software release; separate tweet notes 'Gemini 3.5 Flash beats last year's Pro. The pricing conversation tells the rest.' [39][19]
2026-05-22: Google's enterprise counter-narrative emerges: VentureBeat reports Google's claim that Gemini 3.5 Flash can cut enterprise AI costs by more than $1 billion a year vs. competing providers; AI Business echoes the enterprise efficiency angle; benchmark coverage frames model as scoring within two points of Anthropic's flagship at a third of the price [20][22][21]
2026-05-23: Scale Advantage posts that Gemini 3.5 Pro is coming next month; developer calls publicly for lower pricing; Tomasz Tunguz publishes 'The Unsustainable Subsidy,' framing industry repricing as the end of an AI land-grab subsidy era [36][26][27]
2026-05-24: HN thread surfaces €54k billing spike in 13 hours from an unrestricted Firebase browser key accessing the Gemini API, adding a security-exposure dimension to the billing anomaly story [31]

Perspectives

Simon Willison

Analytically skeptical of the price increases; documents the cost jump with benchmark data and frames it as part of an industry-wide repricing trend affecting all three major labs. Notes the irony of Google deploying a pricier model in free consumer products while charging API customers near-Pro rates.

Evolution: Consistent across all posts.

[12][33][34][37]

Tomasz Tunguz

Frames current AI API pricing as the end of an unsustainable subsidized land-grab era; the repricing across Google, OpenAI, and Anthropic reflects labs moving toward margin recapture after years of below-cost access.

Evolution: New voice this pass; provides VC/analyst credibility to the broader repricing thesis Willison has been documenting at the model level.

[27]

Artificial Analysis

Positive on capability: rates Gemini 3.5 Flash as the clear leader on the intelligence-vs-speed Pareto frontier with large benchmark gains. Their own cost data ($1,551.60 per benchmark run, vs. $892.28 for Gemini 3.1 Pro Preview) illustrates the price burden even as they endorse the performance.

Evolution: Consistent; full article published this pass [16] in addition to the launch-day benchmark data.

[15][12][16]

Early-access user (Prompt/@engineerrprompt)

Positive on capability: first impressions describe Gemini 3.5 Flash as feeling on par with Gemini 3.1 Pro Preview, implicitly validating the capability-tier collapse.

Evolution: Consistent.

[17]

Google

Positions Gemini 3.5 Flash as a flagship efficiency model with frontier-level intelligence, deploying it as the default in consumer products, Google Search AI Mode, and updated subscription tiers. Has since pivoted to an enterprise cost-savings counter-narrative, claiming the model can cut enterprise AI costs by more than $1 billion a year versus competing providers — reframing the pricing debate from internal comparison (vs. prior Flash) to cross-vendor comparison (vs. GPT-5/Claude).

Evolution: Expanded: enterprise savings counter-narrative is now amplified through trade press coverage [21] in addition to VentureBeat [20].

[1][2][6][11][38][20][21][7][9]

Developer community (broad)

Strongly negative on pricing; characterize the 3x cost increase as a betrayal of the Flash-tier value proposition. Migration research is underway. Billing anomaly complaints — unexplained cost spikes on prior models, a €54k Firebase key exposure, and price increases predating the launch — are deepening distrust beyond sticker-price frustration.

Evolution: Expanded: billing anomaly concerns have grown from a single forum thread to multiple independent reports covering different failure modes (usage anomalies, security exposure, undisclosed price changes), suggesting structural billing opacity rather than isolated incidents.

[39][40][23][24][41][14][42][13][43][44][28][29][31][25][26][30][32]

Cross-vendor benchmark observers

Frame Gemini 3.5 Flash favorably relative to expensive flagship models from Anthropic and OpenAI — scoring within two points of Anthropic's flagship at a third of the price — implicitly supporting Google's enterprise cost narrative by shifting the reference point away from prior Gemini Flash models.

Evolution: Consistent with prior pass; the Towards AI 18-agent-task review [18] adds a practical agentic-workload dimension, finding it outperforms GPT-5.5 at 4x speed despite a 6x price premium vs. Flash-Lite.

[22][18][45]

Tensions

Artificial Analysis and early-access users rate Gemini 3.5 Flash as a genuine capability leader matching or exceeding Gemini 3.1 Pro [15][17][16], while the developer community broadly rejects the pricing as unjustifiable for a Flash-tier model [23][24][25] — a standoff between benchmark performance and cost tolerance. [15][17][16][23][24][25]
Google and cross-vendor benchmarks frame Gemini 3.5 Flash as cheap relative to GPT-5 and Claude flagship models [20][22][21], while API developers experience it as 3x more expensive than the prior Flash model they were paying for [12][13] — two pricing comparisons that are both factually accurate but point to opposite conclusions about affordability. [20][22][21][12][13]
Google positions Gemini 3.5 Flash as a Flash-tier efficiency model and deploys it free in consumer products and updated subscription tiers [6][11] while its API pricing nearly matches Gemini 3.1 Pro [12][13] — an unresolved tension between the 'Flash' branding promise and the cost burden placed on API customers. [12][13][6][11][1]
Google's stable public pricing pages imply a predictable billing environment, but developer forum threads document cost increases dating to March 2026 [30], a €54k Firebase key exposure [31], and hidden price hikes not reflected on the pricing page [32] — a tension between Google's official pricing narrative and the billing opacity developers are actually experiencing. [30][31][32][28][29]

Sources

[1] Gemini 3.5: frontier intelligence with action - Google Blog — reactive:google-io-gemini-launch
[2] Google shipped Gemini 3.5 Flash at I/O 2026. Marketing says "frontier intelligence with action." — reactive:gemini-35-flash-release (2026-05-20)
[3] Gemini 3.5 Flash is now visible within the Google Cloud Console, preceding its official announcement at Google I/O. It a... — reactive:google-io-2026-launch-blitz (2026-05-19)
[4] 🧵 Gemini 3.5 Flash Appears in Google Cloud Console (May 19, 2026) — reactive:gemini-35-flash-release (2026-05-19)
[5] Google DeepMind have released Gemini Omni Flash and Gemini 3.5 Flash — reactive:google-io-2026-launch-blitz (2026-05-19)
[6] Gemini 3.5 Flash is officially live, replacing Gemini 3 as the default option on the web interface and powering Google S... — reactive:gemini-35-flash-release (2026-05-20)
[7] What's new in Gemini 3.5 Flash - Interactions API — reactive:google-io-2026-gemini
[8] What's new in Gemini 3.5 Flash - generateContent API — reactive:google-io-2026-launch-blitz
[9] Gemini 3.5 Flash | Gemini Enterprise Agent Platform — reactive:google-io-2026-gemini
[10] Gemini 3.5 Flash - Model Card - Google DeepMind — reactive:google-io-gemini-launch
[11] Everything new in our Google AI subscriptions, fresh from I/O 2026 — reactive:google-io-gemini-launch
[12] Gemini 3.5 Flash: more expensive, but Google plan to use it for everything — Simon Willison (2026-05-19)
[13] 3.5 Flash pricing: $1.50/million input tokens, $9/million output. That's 3x Gemini 3 Flash Preview and 6x the 3.1 Flash-... — reactive:gemini-35-flash-release (2026-05-20)
[14] Gemini 3.5 Flash now costs $9 per million output tokens , 3x the price of Gemini 3 Flash and nearly 30x more than Gemini... — reactive:gemini-35-flash-release (2026-05-19)
[15] Google’s new Gemini 3.5 Flash is the clear leader on the Intelligence vs Speed Pareto frontier and makes large gains on ... — reactive:google-io-2026-launch-blitz (2026-05-19)
[16] Gemini 3.5 Flash: The new leader in intelligence versus speed — reactive:google-io-gemini-launch
[17] Gemini 3.5 Flash, first impressions. I had early access to the model, and it feels on par with Gemini 3.1 Pro Preview. T... — reactive:google-io-2026-launch-blitz (2026-05-19)
[18] I Tested Gemini 3.5 Flash on 18 Agent Tasks — Google's 6× Pricier ... — reactive:google-io-2026-gemini
[19] Gemini 3.5 Flash beats last year's Pro. The pricing conversation tells the rest. — reactive:gemini-35-flash-release (2026-05-21)
[20] Google says Gemini 3.5 Flash can slash enterprise AI costs by more ... — reactive:gemini-35-flash-release
[21] Google Aims at Enterprise Cost Efficiency With Gemini 3.5 Flash — reactive:gemini-35-flash-release
[22] Google's Gemini 3.5 Flash scores within two points of Anthropic's ... — reactive:gemini-35-flash-release
[23] gemini 3.5 flash pricing is painful. — reactive:gemini-35-flash-release (2026-05-19)
[24] Gemini 3.5 Flash pricing has changed - the good times are over! It’s now 3x more expensive than Gemini 3 Flash. https://... — reactive:gemini-35-flash-release (2026-05-19)
[25] Gemini 3.5 flash costs 3 times more than the previous version and ... — reactive:gemini-35-flash-release
[26] @kr0der For now, I’d really like to see lower pricing for — reactive:gemini-35-flash-release (2026-05-23)
[27] The Unsustainable Subsidy | Tomasz Tunguz — reactive:codex-practical-dev-tool
[28] Sudden Cost Spike with gemini-3-flash-preview Despite Decreased ... — reactive:gemini-35-flash-release
[29] Sudden Cost Spike with gemini-3-flash-preview Despite Decreased Usage (April 2026) - Page 2 - Gemini API - Google AI Developers Forum — reactive:gemini-35-flash-release
[30] URGENT: The cost of API access has increased since March 16-17, 2026 - Gemini API - Google AI Developers Forum — reactive:gemini-35-flash-release
[31] €54k spike in 13h from unrestricted Firebase browser key accessing ... — reactive:gemini-35-flash-release
[32] The AI price hike that never showed up on the pricing page ... — reactive:gemini-35-flash-release
[33] llm-gemini 0.32 — Simon Willison (2026-05-19)
[34] llm-gemini 0.32a0 — Simon Willison (2026-05-19)
[35] LLM plugin to access Google's Gemini family of models - GitHub — reactive:gemini-35-flash-release
[36] Gemini 3.5 pro coming out next month and might be amazing. But the only major data point since Gemini 3 release 6mo ago ... — reactive:google-io-2026-launch-blitz (2026-05-23)
[37] Gemini 3.5 Flash: more expensive, but Google plan to use it for everything — reactive:gemini-35-flash-release
[38] Release notes | Gemini API - Google AI for Developers — reactive:google-io-gemini-launch
[39] @andyzhang @antigravity I never saw such a negative feedback in a software release before. Replacing Gemini 3 Flash quot... — reactive:google-io-2026-launch-blitz (2026-05-21)
[40] Gemini 3.5 Flash Pricing: 😢 — reactive:gemini-35-flash-release (2026-05-19)
[41] GEMINI 3.5 FLASH MIGHT NOT BE CHEAP AT ALL — reactive:gemini-35-flash-release (2026-05-19)
[42] Gemini 3.5 Flash is EXPENSIVE. — reactive:gemini-35-flash-release (2026-05-19)
[43] Gemini 3.5 Flash vs Gemini 3 Flash Preview: Pricing & Migration ... — reactive:gemini-35-flash-release
[44] Google just dropped Gemini 3.5 Flash and the price hike is pretty ... — reactive:gemini-35-flash-release
[45] @GeminiApp 3.5 Flash just dropped at Google I/O. It outperforms Gemini 3.1 Pro across almost all benchmarks — and runs ... — reactive:gemini-35-flash-release (2026-05-24)
[46] UPDATE: Gemini 3.2 Flash-Lite-Live just spotted in the Google Cloud Console. — reactive:google-io-2026-launch-blitz (2026-05-17)
[47] Gemini 3.5 Flash (In Console) and Gemini Omni are nearing release, with only two hours remaining until Google I/O. https... — reactive:google-io-2026-launch-blitz (2026-05-19)
[48] Google’s official pricing page has been updated to include Gemini 3.5 Flash. — reactive:gemini-35-flash-release (2026-05-19)
[49] Google launches Gemini 3.5 Flash. How to try it for free. | Mashable — reactive:google-io-gemini-launch