a daily news desk
Tools

Gemini 3.5 Flash hits GA, beats Google's own 3.1 Pro on agent benchmarks

Google shipped the mid-tier model at I/O on Tuesday and made it the default across Search, the Gemini app and Antigravity.

Google made Gemini 3.5 Flash generally available at I/O on Tuesday, and the mid-tier model now outperforms the company’s own previous flagship, Gemini 3.1 Pro, on every agentic benchmark Google chose to publish: 76.2% on Terminal-Bench 2.1, 83.6% on MCP Atlas, 1656 Elo on GDPval-AA, and 84.2% on CharXiv Reasoning. Google also claims a 4× output-throughput advantage over other frontier models, and up to 12× speedups inside Antigravity.

The naming is doing real work here. By branding the model “Flash” while shipping numbers that beat the prior “Pro,” Google is telling developers the agent stack, not the chatbot, is the product surface that matters now. Tulsee Doshi, senior director and head of product at Google, framed Gemini 3.5 Pro (due next month) as the “orchestrator” with Flash instances acting as “various sub-agents.” Koray Kavukcuoglu, chief technologist at DeepMind, said the new model “offers an incredible combination of quality and low latency.”

Independent numbers complicate the marketing. Artificial Analysis, as relayed by Latent Space, scored Gemini 3.5 Flash at 55 on its Intelligence Index, a 9-point jump over Gemini 3 Flash, at over 280 output tokens per second, with 84% on MMMU-Pro. Pricing, though, runs $1.50 per million input tokens and $9.00 per million output tokens. That’s 5.5× costlier than Gemini 3 Flash and 75% more expensive than the 3.1 Pro it just beat. The “Flash” tier is no longer the cheap tier.

Distribution is the louder story. Gemini 3.5 Flash is now the default across the Gemini app, AI Mode in Search, Antigravity, and the Gemini API via AI Studio and Android Studio, with a 1M-token context window, 65k max output, four thinking levels, and thought preservation across turns. It also powers Gemini Spark, Google’s new 24/7 personal agent.

Enterprise customers don’t get a choice for long. The release notes confirm GA in Global, US and EU regions, with Gemini 2.5 Flash pulled from the selector and the off-toggle disappearing on June 8. The mid-tier isn’t a tier anymore. It’s the floor.

Sources