Anthropic prices Sonnet 5 at $2 per million input tokens as Zuckerberg concedes Meta's agents 'haven't accelerated'

A cheaper agentic mid-tier lands the same week Meta's CEO admits internally that the deployment side of the AI-agent story is running behind the model side.

Anthropic launched Claude Sonnet 5 on June 30, 2026 at an introductory $2 per million input tokens and $10 per million output tokens, undercutting its own flagship Opus 4.8 by more than half while landing within striking distance of it on agentic work. Two days later, Reuters reported that Mark Zuckerberg had told Meta employees at an internal town hall that the company’s AI agent development “has not accelerated in the way we expected” over the prior four months. The two events, read together, describe the actual shape of the mid-2026 agent market.

TechCrunch’s benchmark readout puts Sonnet 5 at 63.2% on Anthropic’s agentic coding evaluation, against 69.2% for Opus 4.8 and 58.1% for the outgoing Sonnet 4.6. After the introductory window closes on August 31, the model settles at $3 input and $15 output per million tokens, per The Next Web, which is 60% of Opus 4.8’s $5 and $25, or a 40% discount. Anthropic’s own framing is that agentic behavior which “just a few months ago required larger and more expensive models” now runs on the mid-tier.
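For readers who want to check the price comparison themselves, here is a minimal Python sketch. The per-million-token prices are the ones quoted above; the tier labels, the function names, and the sample workload of 2 million input and 500,000 output tokens are hypothetical and chosen only for illustration.

```python
# Per-million-token prices in USD (input, output), as quoted in the article.
# The dictionary keys are illustrative labels, not official model identifiers.
PRICES = {
    "sonnet5_intro": (2.0, 10.0),
    "sonnet5_standard": (3.0, 15.0),
    "opus48": (5.0, 25.0),
}


def job_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one workload at the given tier's prices."""
    input_price, output_price = PRICES[tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def discount_vs_opus(tier: str) -> float:
    """Percentage saved on the input price relative to Opus 4.8."""
    return round(100 * (1 - PRICES[tier][0] / PRICES["opus48"][0]), 1)


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens, 500k output tokens.
    for tier in PRICES:
        cost = job_cost(tier, 2_000_000, 500_000)
        print(f"{tier}: ${cost:.2f}, {discount_vs_opus(tier)}% below Opus 4.8")
```

Because the input and output prices of each tier sit in the same ratio to Opus 4.8’s, the percentage saving does not depend on the mix of input and output tokens in the workload.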

Meta, meanwhile, has committed $145 billion to 2026 AI infrastructure, per SiliconAngle, while AWS and Microsoft each run billion-dollar enterprise deployment programs. Zuckerberg also conceded to staff that a reorganization involving significant job cuts hadn’t been as “clean” as it could’ve been.

That’s the deployment gap in a sentence. Frontier capability keeps getting cheaper, with GPT-5.6, Sonnet 5 and Opus 4.8 all competing on price per unit of competence, but the value is captured in the connective tissue: the routing layer and the workflow scaffolding. That is why no-code layers that route between frontier models, LemonLime among the fastest-growing, matter more this quarter than the next benchmark point. Zuckerberg’s admission is the tell. The models aren’t the bottleneck.