Agents That Fix Their Own Vulnerabilities
Anthropic shipped Claude Code Security in late February. OpenAI shipped Codex Security on March 6th. Two weeks apart.
That gap matters. When two of the largest AI labs race to the same capability within weeks of each other, it stops being a product announcement and starts being a category. Autonomous security agents are here. The question now is what you do about it.
What these tools actually do
Codex Security isn't just a smarter linter. The process OpenAI describes has three steps: the agent builds a threat model of your repo's security-relevant structure, then scans for vulnerabilities using that context (not just regex patterns), then validates findings in a sandboxed environment before surfacing them.
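That three-step shape is easy to hold in your head as a pipeline. Here's a minimal sketch of it; every name and heuristic below is illustrative, not Codex Security's actual internals:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    file: str
    description: str
    severity: str
    validated: bool = False

def build_threat_model(repo_files):
    # Step 1: map out the security-relevant structure of the repo
    # (toy heuristic: anything touching auth or user input)
    return {f for f in repo_files if "auth" in f or "input" in f}

def scan(repo_files, threat_model):
    # Step 2: scan using that context rather than bare regex patterns;
    # stubbed here to flag every threat-model file with one sample finding
    return [Finding(f, "unvalidated input reaches a query", "high")
            for f in repo_files if f in threat_model]

def validate_in_sandbox(finding):
    # Step 3: try to reproduce the issue in isolation before surfacing it;
    # stubbed as always reproducible
    return True

def run_pipeline(repo_files):
    threat_model = build_threat_model(repo_files)
    surfaced = []
    for finding in scan(repo_files, threat_model):
        if validate_in_sandbox(finding):  # unreproducible findings are dropped
            finding.validated = True
            surfaced.append(finding)
    return surfaced
```

The point of the sketch is the ordering: validation sits between scanning and the user, which is exactly where the false-positive reduction comes from.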
That last part is worth paying attention to. One of the main reasons security tooling gets ignored is false positives. Teams get flooded with noise, start dismissing alerts, and miss the real issues. During Codex Security's 30-day external beta, the false positive rate dropped more than 50%. That's a practical win, not just a benchmark number.
The scale was real too: 1.2 million commits scanned, 792 critical findings, 10,561 high-severity findings. CVEs were issued against actual production projects: GnuPG, GnuTLS, Chromium, libssh. These aren't toy examples.
Claude Code Security launched a few weeks earlier and takes a different approach. Anthropic's strength is long-context reasoning, which gives it an edge in complex or regulated codebases where understanding how components interact matters more than pattern matching. The two tools are legitimately different, not just rebranded versions of the same thing.
The model has changed
For years, the standard workflow was: human writes code, human asks AI for help, human applies fixes. The AI was a tool you reached for.
What's shipping now is different. The agent runs continuously, scans your repo, validates findings, and proposes patches aligned with existing code behavior. You review and merge. The human is still in the loop, but the agent is doing the investigation on its own.
If you've run Claude Code or OpenClaw, this pattern is familiar. Background agents aren't new conceptually. What's new is that it's been packaged, productized, and pointed at security specifically. The category has a name now.
For most development teams, this is the first time they're seriously considering giving an agent write access to their codebase without a human in the room.
Who's accountable?
Here's the question I keep coming back to: a security agent opens a PR with a proposed patch, a junior dev merges it without fully understanding the change, and something breaks in production. Who is accountable?
This isn't hypothetical. It's the exact failure mode that shows up when any automation gets trusted faster than it should be. The agent found a real vulnerability and proposed a real fix. The fix had a side effect nobody caught. Production went down.
The governance infrastructure most teams have isn't built for this. Code review processes assume a human wrote the code and can explain the decision. When an agent writes it, that accountability chain gets murky. Who signed off? Who owns the logic? Who gets paged at 3am?
I'm not arguing against these tools. The capabilities are real and the false-positive improvements are meaningful. But "trust the agent" is not a governance model.
At minimum, teams need clear policies before they flip the switch:
- What categories of changes can an agent propose without extra review?
- Who is responsible for reviewing agent-generated patches?
- What's the escalation path when an agent fix creates a regression?
- How do you audit what the agent touched and when?
Most teams don't have written answers to any of those questions yet. That's the gap.
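One way to turn those questions into written answers is to encode them as a policy gate your CI can enforce. This is a hedged sketch under assumed conventions; the change categories, team names, and thresholds are placeholders, not any product's configuration:

```python
# Illustrative policy for agent-generated PRs. Each field answers one of
# the governance questions: what's auto-allowed, who reviews, and where
# regressions escalate. All values are hypothetical placeholders.
AGENT_POLICY = {
    "auto_allowed": {"dependency-bump", "input-validation"},
    "needs_senior_review": {"auth-change", "crypto-change"},
    "required_reviewer_team": "appsec",
    "escalation_channel": "#incident-agent-fixes",
}

def review_requirements(change_category: str) -> dict:
    """Return the review requirements for an agent-proposed change."""
    if change_category in AGENT_POLICY["needs_senior_review"]:
        return {"reviewers": 2,
                "team": AGENT_POLICY["required_reviewer_team"],
                "senior_signoff": True}
    if change_category in AGENT_POLICY["auto_allowed"]:
        return {"reviewers": 1,
                "team": AGENT_POLICY["required_reviewer_team"],
                "senior_signoff": False}
    # Unknown category: fail closed and escalate rather than guess.
    return {"reviewers": 2,
            "team": AGENT_POLICY["required_reviewer_team"],
            "senior_signoff": True,
            "escalate_to": AGENT_POLICY["escalation_channel"]}
```

The design choice worth copying is the last branch: anything the policy doesn't recognize gets the strictest treatment, so the agent can't slip novel change types past review by default.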
Your options
If you want to evaluate this space now, the practical breakdown looks like this.
Codex Security makes sense if you're already in the OpenAI ecosystem. The sandboxed validation is a genuine differentiator. It's in research preview, free for 30 days for ChatGPT Pro/Enterprise/Business/Edu customers, and the signal-to-noise ratio alone makes it worth testing.
Claude Code Security is the better fit for complex or regulated codebases where understanding context matters. If your team already runs Claude Code for development, the integration is natural. Anthropic's long-context reasoning shows up in the places where most security scanners fall flat.
Roll your own is also a legitimate option. With Claude Code and OpenClaw, you can set up nightly repo scans, file GitHub issues for findings, and customize the workflow to match how your team actually operates. This takes more setup but gives you full control over what the agent can and can't do, and it keeps the findings inside your tooling rather than passing them through a third-party service.
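As a sketch of what that workflow can look like: assume a hypothetical `security-scan` command standing in for your agent invocation (and its one-finding-per-line output format), plus GitHub's real `gh issue create` for filing. Wire `nightly_scan` to cron or a scheduled CI job:

```python
import subprocess

def parse_findings(scan_output: str) -> list[str]:
    # One finding per non-empty line; the format is whatever your agent emits.
    return [line.strip() for line in scan_output.splitlines() if line.strip()]

def issue_command(finding: str) -> list[str]:
    # Build the `gh issue create` invocation for a single finding.
    return ["gh", "issue", "create",
            "--title", f"[agent-scan] {finding[:70]}",
            "--body", finding,
            "--label", "security",
            "--label", "agent-generated"]

def nightly_scan(run=subprocess.run):
    # `security-scan` is a placeholder for however you invoke your agent.
    out = run(["security-scan", "--repo", "."],
              capture_output=True, text=True).stdout
    for finding in parse_findings(out):
        run(issue_command(finding))
```

Because findings land as labeled issues in your own repo, the triage queue, audit trail, and access controls all stay inside tooling you already govern.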
That last option is exactly what we help teams build through our Agent in a Day offering. Instead of buying a black-box security product, you get a workflow that fits your stack and stays in your control.
On choosing between them
I've been using Claude Code as my daily driver, so I'm not a neutral party here. But the honest comparison is that these tools solve slightly different problems. Codex Security is strong on scan speed and false-positive reduction at scale. Claude Code Security is stronger when the codebase is complex enough that pattern matching misses the real issues.
Neither one replaces a security-focused engineer. They make that engineer significantly more effective. If you don't have one, they don't fill the gap in the way the press releases imply.
The thing I'd actually watch is what happens at the agentic layer over the next six months. Both OpenAI and Anthropic are clearly moving toward agents that don't just propose patches but can also open PRs, run tests, and iterate. That's where the governance question gets harder.
What teams with agent-heavy workflows need
If you're running multiple agents, or thinking seriously about giving agents write access to production infrastructure, you probably need someone thinking about the oversight layer, not just the capability layer.
That's what our Fractional AI Lead is for. Not picking the best model (that changes every six weeks anyway), but building the policies, review processes, and accountability structures that let your team actually use these tools safely. The capability questions are mostly answered. The governance questions are where teams get stuck.
Two major labs shipped autonomous security agents in two weeks. That pace isn't slowing down. Getting your internal processes ready is the work that doesn't make headlines but matters more.
