Claude 4 Opus: Anthropic Goes All-In on Coding Agents
Claude 4 Opus launched May 22, 2025 alongside Claude 4 Sonnet — Anthropic's biggest model release since Claude 3. The headline: best coding model in the world at launch, with a clear agentic focus that signals where AI software development is going.
Every few releases, a model ships that genuinely changes how you work. Claude 4 Opus is one of those.
The benchmark story is good — best SWE-bench Verified score at launch, beats Gemini 2.5 Pro and GPT-4.5 on software engineering tasks. But the more interesting story is the product design: Anthropic is shipping Claude 4 Opus as the core of a coding agent, not just a completion model.
What's New
Alongside the model:
- Claude 4 Opus — Anthropic's new flagship. Hybrid architecture (instant + extended thinking). Best coding model at launch.
- Claude 4 Sonnet — Also ships today. Cheaper, faster, almost as capable for most tasks. The model to use by default.
- Claude Code GA — The terminal coding agent moves from preview to general availability. Opus 4 is the engine.
The capabilities that matter:
- Coding at a new level — SWE-bench Verified scores top the leaderboard at launch. On complex, real-world software engineering tasks, Opus 4 consistently outperforms Gemini 2.5 Pro and the o-series models.
- Computer use — Controlling a browser or desktop as part of an agentic workflow. Not perfect, but functional enough to automate real tasks.
- Extended thinking + tool use, combined — The model can reason, use a tool, observe the result, reason again. Long chains of tool calls with preserved context.
- Better instruction following — Significantly reduced refusals on legitimate tasks. The model does what you ask.
How It Compares at Launch
| Model | SWE-bench | Agentic Tool Use | Extended Thinking |
|---|---|---|---|
| Claude 4 Opus | #1 | ★★★★★ | ✅ |
| Gemini 2.5 Pro | top 3 | ★★★★☆ | ✅ |
| GPT-4.5 | moderate | ★★★☆☆ | ❌ |
| Claude 3.7 Sonnet | top 3 | ★★★★☆ | ✅ |
| Llama 4 Maverick | open-weight leader | ★★★☆☆ | ✅ |
Best For
- Agentic coding — If you're building with Claude Code or API-based coding agents, Opus 4 is the obvious choice
- Multi-step workflows with tool use and computer control
- Hard coding problems where extended thinking materially improves results
- Software engineering research — the SWE-bench lead here is substantial
Use Sonnet 4 Instead When
- You need 80% of the quality at 20% of the cost
- Response latency matters (Sonnet is significantly faster)
- Your task is relatively straightforward (summarization, Q&A, generation)
Verdict
Claude 4 Opus is the best coding model in the world as of May 22, 2025, and the GA of Claude Code makes that capability accessible to any developer. The combined model + agent story is the most coherent one in the industry right now. Anthropic has been quietly building the scaffolding for AI-assisted software development for 18 months; Opus 4 is where that work becomes obviously useful.
Part of our Model Watch series. Next: GPT-5 →
