← Back to blog
·Greg Mousseau

Claude 4 Opus: Anthropic Goes All-In on Coding Agents

Claude 4 Opus launched May 22, 2025 alongside Claude 4 Sonnet — Anthropic's biggest model release since Claude 3. The headline: best coding model in the world at launch, with a clear agentic focus that signals where AI software development is going.

Model ReviewFrontier ModelsAI StrategyAnthropic

Every few releases, a model ships that genuinely changes how you work. Claude 4 Opus is one of those.

The benchmark story is good — best SWE-bench Verified score at launch, beats Gemini 2.5 Pro and GPT-4.5 on software engineering tasks. But the more interesting story is the product design: Anthropic is shipping Claude 4 Opus as the core of a coding agent, not just a completion model.

What's New

Alongside the model:

  • Claude 4 Opus — Anthropic's new flagship. Hybrid architecture (instant + extended thinking). Best coding model at launch.
  • Claude 4 Sonnet — Also ships today. Cheaper, faster, almost as capable for most tasks. The model to use by default.
  • Claude Code GA — The terminal coding agent moves from preview to general availability. Opus 4 is the engine.

The capabilities that matter:

  • Coding at a new level — SWE-bench Verified scores top the leaderboard at launch. On complex, real-world software engineering tasks, Opus 4 consistently outperforms Gemini 2.5 Pro and the o-series models.
  • Computer use — Controlling a browser or desktop as part of an agentic workflow. Not perfect, but functional enough to automate real tasks.
  • Extended thinking + tool use, combined — The model can reason, use a tool, observe the result, reason again. Long chains of tool calls with preserved context.
  • Better instruction following — Significantly reduced refusals on legitimate tasks. The model does what you ask.

How It Compares at Launch

ModelSWE-benchAgentic Tool UseExtended Thinking
Claude 4 Opus#1★★★★★
Gemini 2.5 Protop 3★★★★☆
GPT-4.5moderate★★★☆☆
Claude 3.7 Sonnettop 3★★★★☆
Llama 4 Maverickopen-weight leader★★★☆☆

Best For

  • Agentic coding — If you're building with Claude Code or API-based coding agents, Opus 4 is the obvious choice
  • Multi-step workflows with tool use and computer control
  • Hard coding problems where extended thinking materially improves results
  • Software engineering research — the SWE-bench lead here is substantial

Use Sonnet 4 Instead When

  • You need 80% of the quality at 20% of the cost
  • Response latency matters (Sonnet is significantly faster)
  • Your task is relatively straightforward (summarization, Q&A, generation)

Verdict

Claude 4 Opus is the best coding model in the world as of May 22, 2025, and the GA of Claude Code makes that capability accessible to any developer. The combined model + agent story is the most coherent one in the industry right now. Anthropic has been quietly building the scaffolding for AI-assisted software development for 18 months; Opus 4 is where that work becomes obviously useful.

Part of our Model Watch series. Next: GPT-5 →