BreakingNews
Claude Sonnet 5 Closes the Gap With Opus on Agentic AI Tasks

Image: Flickr / Wikimedia Commons / Unsplash

Claude Sonnet 5 Closes the Gap With Opus on Agentic AI Tasks

Anthropic's new mid-tier model brings near-Opus performance, a 1M token context window, and tighter safety guardrails at a lower price.

June 30, 20265 min read

This article was produced by the AETW editorial team.

Anthropic released Claude Sonnet 5 on June 30, 2026, an agentic mid-tier model that narrows the performance gap with Opus 4.8 while undercutting it and rival models on price.

The agentic gap is closing

Anthropic shipped Claude Sonnet 5 on June 30, 2026, calling it the most agentic Sonnet-class model the company has built. The pitch is straightforward: the model plans multi-step work, operates browsers and terminals, and carries tasks through to completion without the hand-holding earlier Sonnet versions needed.

That matters because Sonnet has historically been the budget option next to Anthropic's Opus tier. Sonnet 3.5, 3.6, and 3.7 introduced agentic coding to a broad developer base, but the biggest capability jumps over the past year landed in Opus-class models instead. Sonnet 5 is built to close that gap rather than just iterate on its predecessor, Sonnet 4.6.

Anthropic says Sonnet 5 now performs close to Opus 4.8 on reasoning, tool use, coding, and general knowledge work, while charging meaningfully less per token. For US teams running high request volumes, that shift changes which model becomes the default for production traffic rather than the occasional escalation.

Pricing undercuts Opus, GPT-5.5, and Gemini 3.1 Pro

Sonnet 5 is available immediately across Free, Pro, Max, Team, and Enterprise plans, and it is now the default model on Free and Pro. It also ships in Claude Code, on the Claude Platform, and through the Claude API under the model ID claude-sonnet-5.

Introductory API pricing runs at $2 per million input tokens and $10 per million output tokens through August 31, 2026, after which it moves to standard pricing of $3 per million input tokens and $15 per million output tokens. Anthropic notes Sonnet 5 uses an updated tokenizer that can map the same input to roughly 1.0 to 1.35 times more tokens than before, and the introductory pricing is set to keep the transition close to cost-neutral.

For US developers comparing options, that puts Sonnet 5 below Opus 4.8's $5/$25 pricing, and reporting indicates it also undercuts OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro on a per-token basis, while still costing more than Google's lighter Gemini 3.5 Flash. Anthropic also raised rate limits across Chat, Cowork, Claude Code, and the Claude Platform to handle the heavier token usage that comes with higher effort levels.

What the benchmarks actually show

Claude Sonnet 5

Source: Anthropic

On agentic coding, Anthropic and outside reporting put Sonnet 5 at roughly 63.2%, ahead of Sonnet 4.6's 58.1% and within range of Opus 4.8's 69.2%. On a knowledge-work benchmark, Sonnet 5 reportedly edges past Opus 4.8 outright, which is unusual for a mid-tier model and underscores how much of this release is about narrowing specific gaps rather than a flat across-the-board upgrade.

The model also ships with a 1 million token context window, enough to hold a large codebase or a long stack of documents in a single request without a long-context surcharge. Anthropic published cost-performance curves on BrowseComp, an agentic search evaluation, and OSWorld-Verified, a computer-use benchmark, showing Sonnet 5 as a consistent improvement over Sonnet 4.6 at every effort level, with Opus 4.8 still ahead on raw accuracy.

Anthropic frames the choice between the two models as a dial rather than a binary: Sonnet 5 for high-volume production work where cost per task matters, Opus 4.8 when a task genuinely needs the highest accuracy available and the price difference is justified.

The safety tradeoffs builders should know

Anthropic's pre-deployment evaluations found Sonnet 5 shows a lower overall rate of undesirable behavior than Sonnet 4.6, including reduced hallucination and sycophancy, and better resistance to prompt-injection hijack attempts. On the company's automated behavioral audit, which checks for things like cooperation with misuse and deception, Sonnet 5 scored safer than its predecessor, though still somewhat behind Opus 4.8 and the limited-availability Claude Mythos Preview.

On cybersecurity specifically, Anthropic says it did not deliberately train Sonnet 5 on offensive cyber tasks, and the model showed substantially weaker results than Opus-class models on tests like developing working exploits for browser vulnerabilities. It never produced a full working exploit in Anthropic's Firefox-based evaluation, though it showed a slightly higher rate of partial success than Sonnet 4.6.

Because of that uptick, Sonnet 5 ships with the same real-time cyber safeguards used in Opus 4.7 and 4.8, enabled by default rather than opt-in. For security and compliance teams, that is the detail worth flagging before greenlighting Sonnet 5 for sensitive internal tooling.

Early signals from US enterprise users

Anthropic's launch partners point to the same pattern: tasks that used to stall partway now finish end to end. A Zapier engineer described handing Sonnet 5 a two-part job, updating Salesforce account tiers and sending a launch announcement to enterprise contacts, and having it complete the full workflow unattended. Teams at ClickHouse and Pace cited faster time-to-insight and more reliable multi-step execution in computer-use agents running real operational workflows like insurance submission intake.

Distribution is also wider than past Sonnet launches. Sonnet 5 went generally available in GitHub Copilot on day one for Pro, Pro+, Max, Business, and Enterprise users, putting it directly in front of the IDE and CLI workflows most US engineering teams already use.

The release lands the same week OpenAI previewed GPT-5.5 Sol and roughly a month after Google's Gemini 3.5 Flash, both pitched around the same agentic framing. For US builders choosing between providers, Sonnet 5's combination of price, context window, and Claude Code integration makes it a reasonable new default rather than a niche upgrade.

Sources

Brian Weerasinghe

AI & Technology Researcher

Brian Weerasinghe is the founder and editor of AI Eating The World, where he covers artificial intelligence, tech companies, layoffs, startups, and the future of work. His reporting focuses on how AI is transforming businesses, products, and the global workforce. He writes about major developments across the AI industry, from enterprise adoption and funding trends to the real-world impact of automation and emerging technologies.

Trusted AI LeaderTrusted AI LeaderTrusted AI LeaderTrusted AI Leader
Trusted by 10,000+ builders

The AI brief for people adapting to changes in work

Join readers tracking AI news, workflow shifts, and practical tools they can use to adapt faster.

Free, no spam, unsubscribe anytime.