← Back to blog
·Greg Mousseau

GPT-5: OpenAI Unifies the Stack

GPT-5 shipped in August 2025 and did something OpenAI had been hinting at for months: it collapsed the GPT-4/o-series split into a single adaptive model. One model, scales to the task. It's the most significant OpenAI release since GPT-4.

Model ReviewFrontier ModelsAI StrategyOpenAI

The o-series (o1, o3) was OpenAI's answer to the reasoning question. But having two separate model families — one for "regular" tasks, one for "hard" tasks — created friction. Users had to know which model to pick. Developers had to route queries. It was a workaround, not a solution.

GPT-5 is the solution.

The Idea

One model. Adaptive compute. When you ask GPT-5 a simple question, it answers quickly. When you ask it something hard, it thinks longer. You don't choose — the model calibrates based on the difficulty of what you asked.

This is how reasoning should work. And at launch, GPT-5 executes on it well enough that switching between "thinking" and "non-thinking" modes feels like a relic of the last generation.

What's New

  • Unified architecture — No more GPT-4o vs. o3 routing decision. One endpoint, adaptive reasoning depth.
  • Best across general benchmarks at launch — Leads GPQA Diamond, MMLU-Pro, and most science/math evaluations at the time of release. Strong on coding.
  • Broad multimodal — Text, images, audio, code, function calls — all native, all improved.
  • Better long-context performance — 128K context window, used more reliably than previous generations.
  • Stronger tool use — Function calling, code interpreter, web search — all more reliable than GPT-4o.

How It Compares at Launch

ModelGPQA DiamondReasoningMultimodal
GPT-5top at launch★★★★★★★★★★
Claude 4 Opustop 3★★★★★★★★★☆
Gemini 2.5 Protop 3★★★★★★★★★★
Llama 4 Maverickopen-weight leader★★★★☆★★★★☆
Grok 3 Thinkingtop 5★★★★★★★★☆☆

GPT-5 is the best general-purpose model at launch. Claude 4 Opus retains the coding agent crown. Gemini 2.5 Pro remains the best on long-context and video.

Best For

  • Everything you used GPT-4o for, but better
  • Tasks that benefit from adaptive reasoning without manual mode-switching
  • Multimodal workflows involving mixed inputs
  • Broad enterprise deployments where a single model needs to handle diverse task types

Not For

  • Coding agents specifically — Claude Code + Opus 4 remains the best stack here
  • Ultra-long context — Gemini 2.5 Pro's 1M window is still ahead
  • Self-hosting — still closed, still API-only

The Real Story

GPT-5 matters not just as a model but as an architecture signal. The future of frontier AI is one adaptive model that scales effort to task, not a menu of specialized models the user has to navigate. Every lab is moving this direction. OpenAI shipped it first.

What comes next is whether they can hold the lead. Gemini 3 Pro is already in development, and the competitive pressure has never been higher.

Part of our Model Watch series. Next: Gemini 3 Pro →