Key facts
- OpenAI's GPT-5.5 Cyber AI achieved an average pass rate of 71.4% on expert-level cyber tasks in AISI's evaluations.
- Anthropic's Mythos Preview model achieved a 68.6% success rate on the same tasks.
- GPT-5.5's cyber performance is similar to that of previous frontier models, pointing to a broader trend rather than a breakthrough specific to one model.
- Anthropic's models are reportedly offline due to a Trump administration export ban.
- OpenAI has launched 'Patch the Planet' to assist open-source projects in improving their security with AI tools.
OpenAI's GPT-5.5 Cyber AI has outperformed Anthropic's Mythos model in recent cybersecurity evaluations. The CyberGym leaderboard shows GPT-5.5 achieving a higher success rate on advanced tasks, including complex reverse engineering and exploitation challenges.
AISI's evaluations found that GPT-5.5 achieved an average pass rate of 71.4% on expert-level tasks, ahead of Mythos Preview's 68.6%. A gap of only 2.8 percentage points suggests a broader trend in AI cyber capabilities rather than a breakthrough specific to one model. The advanced tasks, developed with cybersecurity firms, probe skills such as vulnerability research and exploitation against realistic targets.
This development comes as Anthropic's models, including Mythos, are reportedly offline due to an export ban imposed by the Trump administration. OpenAI, meanwhile, is expanding its efforts in cybersecurity AI, offering "trusted access" to its models and launching initiatives like 'Patch the Planet'.
'Patch the Planet,' a collaboration with Trail of Bits, HackerOne, and Calif., aims to help open-source projects improve their security by leveraging AI tools for vulnerability detection and patching. OpenAI is subsidizing usage of its Codex Security scanner for open-source and private code, with over 30 projects already participating.
