OpenAI's GPT-5.5 Cyber AI Outperforms Anthropic's Mythos Model

Created at 23 Jun · 7:09 PM · 1 source · Market-relevant

IN SHORT

OpenAI's GPT-5.5 Cyber AI has surpassed Anthropic's Mythos model in cybersecurity evaluations, achieving a higher success rate on advanced tasks. This development occurs as Anthropic's models face an export ban under the Trump administration.

Key Numbers

71.4%: GPT-5.5 average pass rate on expert cyber tasks

68.6%: Mythos Preview average pass rate on expert cyber tasks

52.4%: GPT-5.4 average pass rate on expert cyber tasks

48.6%: Opus 4.7 average pass rate on expert cyber tasks

20 hours: Estimated human time for corporate network attack simulation

95: Number of narrow cyber tasks in evaluation suite

27: Number of Practitioner advanced cyber tasks

21: Number of Expert advanced cyber tasks

20 trillion: Tokens OpenAI has subsidized for Codex Security scanner

Who's Involved

OpenAI

Developer of GPT-5.5 Cyber AI model

Anthropic

Developer of Mythos AI model

Trump administration

Implemented export ban affecting Anthropic's models

AISI

Ran the cyber evaluations that produced the reported pass rates

Why This Matters

The superior performance of OpenAI's GPT-5.5 Cyber AI in cybersecurity tasks, coupled with Anthropic's models being offline due to export restrictions, highlights a shift in the AI cybersecurity landscape. OpenAI's proactive approach with initiatives like 'Patch the Planet' suggests a strategy to lead in AI-driven cybersecurity solutions while addressing open-source vulnerabilities.

Key facts

OpenAI's GPT-5.5 Cyber AI achieved a 71.4% average pass rate on expert-level advanced cyber tasks.

Anthropic's Mythos Preview model achieved a 68.6% average pass rate on the same tasks.

GPT-5.5's cyber performance is close to that of other frontier models, which points to a broader trend rather than a breakthrough specific to one model.

Anthropic's models are reportedly offline due to a Trump administration export ban.

OpenAI has launched 'Patch the Planet' to assist open-source projects in improving their security with AI tools.

OpenAI's GPT-5.5 Cyber AI has demonstrated superior performance in cybersecurity evaluations compared to Anthropic's Mythos model, according to recent assessments. The CyberGym leaderboard shows GPT-5.5 achieving a higher success rate on advanced tasks, including complex reverse engineering and exploitation challenges.

AISI's evaluations found that GPT-5.5 achieved an average pass rate of 71.4% on expert-level tasks, ahead of Mythos Preview's 68.6%. With only 2.8 percentage points separating the two models, the result suggests a broader trend in AI cyber capabilities rather than a breakthrough specific to one model. The advanced tasks, developed with cybersecurity firms, probe skills like vulnerability research and exploitation against realistic targets.
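
To put the headline gap in context, the short sketch below works through the arithmetic on the reported figures. It assumes the pass rates are unweighted averages over the 21 expert tasks; the article does not state how tasks are weighted or how many attempts each model received, so the per-task figure is illustrative only. The names `PASS_RATES`, `gap_in_points`, and `points_per_task` are made up for this example.

```python
# Reported average pass rates (percent) on the expert-level cyber tasks.
PASS_RATES = {
    "GPT-5.5": 71.4,
    "Mythos Preview": 68.6,
    "GPT-5.4": 52.4,
    "Opus 4.7": 48.6,
}

# Number of Expert advanced cyber tasks reported in the article.
EXPERT_TASKS = 21


def gap_in_points(model_a: str, model_b: str) -> float:
    """Difference between two reported pass rates, in percentage points."""
    return round(PASS_RATES[model_a] - PASS_RATES[model_b], 1)


def points_per_task(n_tasks: int = EXPERT_TASKS) -> float:
    """Percentage points one task contributes to the average.

    Assumption: every task carries equal weight in the average.
    """
    return round(100 / n_tasks, 2)


lead = gap_in_points("GPT-5.5", "Mythos Preview")
one_task = points_per_task()

print(f"GPT-5.5 lead over Mythos Preview: {lead} points")
print(f"One equally weighted task is worth: {one_task} points")
print(f"Lead is smaller than one task: {lead < one_task}")
```

Under that equal-weight assumption, the lead of GPT-5.5 over Mythos Preview is smaller than the contribution of a single expert task, whereas the gap between GPT-5.5 and GPT-5.4 is several tasks wide.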

This development comes as Anthropic's models, including Mythos, are reportedly offline due to an export ban imposed by the Trump administration. OpenAI, meanwhile, is expanding its efforts in cybersecurity AI, offering "trusted access" to its models and launching initiatives like 'Patch the Planet'.

'Patch the Planet', a collaboration with Trail of Bits, HackerOne, and Calif, aims to help open-source projects improve their security by using AI tools for vulnerability detection and patching. OpenAI is subsidizing usage of its Codex Security scanner for open-source and private code, with over 30 projects already participating.

Frequently asked questions

What is the CyberGym leaderboard?

The CyberGym leaderboard tracks the performance of AI models on a suite of cybersecurity tasks designed to evaluate skills like vulnerability research and exploitation.

Why are Anthropic's models offline?

Anthropic's models are reportedly offline due to an export ban implemented by the Trump administration.

What is 'Patch the Planet'?

'Patch the Planet' is an initiative by OpenAI and partners to help open-source software projects improve their security by finding and patching vulnerabilities using AI tools.

What kind of cyber tasks are evaluated?

The evaluations include tasks such as reverse engineering, web exploitation, cryptography, and vulnerability exploitation against realistic targets and modern mitigations.
