Claude Opus 4.5: Anthropic's Coding Lead Widens

Six months after Claude Opus 4, Anthropic ships the refresh. The headline: agentic coding gets better. Again.

Opus 4.5 isn't a dramatic architectural change — it's a continued refinement of the same hybrid (instant + thinking) model, with focused improvements on the things Anthropic cares most about: tool use, multi-step reasoning, and coding agent reliability.

What Improved

Terminal-Bench Hard — Scores near the top of the agentic coding leaderboard at launch. Multiple third-party agents using Opus 4.5 as their model core debut in the top 10.
Tool use reliability — Fewer dropped tool calls, better recovery from tool errors mid-task. This matters a lot in long-running agentic workflows.
Extended thinking + tool use — The combination is more stable. Opus 4.5 maintains coherent context over longer chains of think → call → observe → think.
Instruction following — Continued reduction in unnecessary refusals. Follows precise technical instructions without defaulting to hedged responses.
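
To make the think → call → observe loop and the "recovery from tool errors" point concrete, here is a minimal sketch of an agent loop. It does not call any real model or the Anthropic API: `scripted_model`, `read_file`, `FILES`, and `run_agent` are hypothetical stand-ins invented for illustration, and the recovery policy (feed the error back to the model as an observation) is an assumption about one common pattern, not a description of how Opus 4.5 or Claude Code work internally.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """A request from the model to run one tool (hypothetical shape)."""
    name: str
    args: dict


def run_agent(model, tools, task, max_steps=10):
    """Drive a think -> call -> observe loop until the model gives a final answer."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = model(history)            # think: the model picks the next action
        if isinstance(action, str):        # a plain string is treated as the final answer
            return action, history
        try:
            result = tools[action.name](**action.args)    # call
            history.append(("observation", result))       # observe
        except Exception as exc:
            # Recovery: report the failure back to the model instead of aborting.
            history.append(("tool_error", f"{action.name}: {exc}"))
    return None, history                   # step budget exhausted


# --- Hypothetical stand-ins so the sketch runs without any external service ---
FILES = {"notes.txt": "42 tests passed"}


def read_file(path):
    return FILES[path]                     # raises KeyError for unknown paths


def scripted_model(history):
    """Fake model: tries a bad path first, then corrects itself after the error."""
    kind, payload = history[-1]
    if kind == "task":
        return ToolCall("read_file", {"path": "missing.txt"})
    if kind == "tool_error":
        return ToolCall("read_file", {"path": "notes.txt"})
    return f"done: {payload}"


answer, history = run_agent(scripted_model, {"read_file": read_file}, "summarize the test run")
print(answer)
print([kind for kind, _ in history])
```

In this toy run the first tool call fails, the error is appended to the history, and the scripted model retries with a valid path. A real agent would replace `scripted_model` with an actual model call, which is where the reliability differences described above would show up.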

How It Compares at Launch

Model	Terminal-Bench Hard	Agentic Tool Use	Notes
Claude Opus 4.5	top 3	★★★★★	Best coding agent at launch
Gemini 3 Pro	competitive	★★★★☆	Better general reasoning
GPT-5	solid	★★★★☆	Better broad capability
Llama 4 Maverick	strong (open-weight)	★★★★☆	Best open-weight option
Claude Opus 4	previous leader	★★★★☆	Replaced by Opus 4.5

Best For

Everything you'd use Claude Code for — this is the engine underneath
Multi-step coding tasks with file system access, test runs, iteration
Any agentic workflow that needs reliable tool use over 10+ steps
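
A rough way to see why per-call reliability matters once a workflow runs past 10 steps: if each step succeeds independently with the same probability, the chance that the whole chain succeeds is that probability raised to the number of steps. The sketch below is illustrative arithmetic only. The per-step rates are made-up inputs, not measurements of Opus 4.5 or any other model, and the independence assumption ignores the error recovery a real agent can perform.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent, identical steps."""
    return per_step ** steps


# Hypothetical per-step success rates, chosen only to show the compounding effect.
for p in (0.95, 0.99, 0.999):
    print(
        f"per-step {p:.3f}: "
        f"10 steps -> {chain_success(p, 10):.3f}, "
        f"50 steps -> {chain_success(p, 50):.3f}"
    )
```

Under these assumptions, a 95% per-step rate leaves roughly a 60% chance of finishing a 10-step task cleanly, while 99% leaves roughly 90%. Small gains per call compound quickly over long chains.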

Not For

General conversation — Sonnet 4.5 (also updated) is faster and cheaper for this
Multimodal-first tasks — Gemini 3 Pro is stronger here
Open-weight deployments — still closed

Verdict

Opus 4.5 cements Anthropic's position as the leader in agentic software development. The gap on Terminal-Bench Hard between Anthropic's stack and everyone else is meaningful. If you're building AI coding workflows, there still isn't a better model.

Part of our Model Watch series.