The latest AI scaremongering is Anthropic’s new model, “Mythos,” an AI system designed to find and exploit vulnerabilities in software. Anthropic claimed that Mythos found thousands of so-called “zero-day” vulnerabilities in every major operating system and browser, some of them decades old, despite constant code review by human developers. A zero-day vulnerability is a fancy way of saying the bug was not known beforehand, so cyber-security personnel have zero days to prepare for it. The fear is that Mythos might be used to create zero-day hacks that shut down hospitals, manufacturing plants, and other critical infrastructure. Mythos might even open a Pandora’s box of new AI risks.
The threat that an omnipotent and omniscient AI would kill all humans has lost its luster, since it keeps not happening. The new threat is that Mythos would destroy the software infrastructure of civilization if it’s not reined in and controlled. Because Mythos is allegedly too powerful and dangerous to release, Anthropic very publicly formed a consortium, “Project Glasswing,” of key tech partners (almost half of whom are Anthropic investors) who would use Mythos to find and eliminate bugs in critical software. The world’s most important software would need to be protected from Mythos before the model could be safely released to the public.
All this fear-mongering has been Anthropic’s consistent marketing strategy. We have been repeatedly told AI software will come alive, like Frankenstein’s monster, possessing superhuman cognitive abilities and alien, inscrutable goals that may well turn out to be malevolent. We must therefore carefully and responsibly develop AI.
But that fear is also great for the bottom line. If LLMs are so powerful that they are an existential threat, isn’t an investment of a few billion, or even a few hundred billion, dollars a small price to pay to own a piece of that power? If LLMs are so dangerous, shouldn’t only a select few companies be allowed to develop them, with potential competitors frozen out by high regulatory barriers? If Mythos is so threatening to software infrastructure, shouldn’t a company pay dearly to have Mythos protect it from Mythos or a looming Mythos wannabe?
Each time these existential fears are raised, however, they fall apart under scrutiny. This time is no different. A model doesn’t have to be alive with superhuman abilities to have found those software vulnerabilities. Much smaller, weaker models could have found them too. The fallacy behind the Mythos myth is that software bugs persist because humans aren’t smart enough to find all of them, so a super-intelligent model must step in. But software bugs mostly go unfound because humans aren’t looking for them. The problem is basic economics: humans aren’t paid enough to find bugs, and profiting from the bugs they do find is illegal hacking. Pay humans enough and remove the legal threat, and they would find a lot more bugs too. Anthropic has not demonstrated that its new model has changed the basic economics of open source cyber-security.
Mythos Isn’t Alive
The non-technical narrative the public and policymakers hear is that AI systems’ cognitive abilities are exploding rapidly toward superhuman intelligence. Mythos, supposedly, may have come alive with the ability to autonomously discover and exploit software vulnerabilities. This fear is just the P(doom) scenario, that AI will cause an existential catastrophe, in another guise. However, if we look at the details in the system card for Claude Mythos Preview, we’ll see these fears are highly exaggerated. Mythos is not a living super-intelligence.
The Alleged 181 Firefox Exploits
Firefox is a free, open source web browser developed by the Mozilla Foundation. One of the most serious vulnerabilities cyber-security experts worry about is one that lets an attacker exploit software such as Firefox to gain full control over the victim’s computer. The attacker puts up a special website and waits for a victim to visit it using Firefox. When the victim connects, the attacker injects his own code to run on the victim’s computer, which could let him steal passwords, log into bank accounts and transfer funds, destroy data, and do just about anything he wants. Anthropic claimed that Mythos was able to find 181 Firefox vulnerabilities that would enable an attacker’s code to control the victim’s computer. If Mythos could autonomously and independently discover how to exploit bugs in Firefox, that would of course be very bad.
Do Anthropic’s claims hold up to scrutiny? Not really. As explained in the system card, Anthropic took some test code that was set up like Firefox but wasn’t actually Firefox, and then disabled the protections that Firefox already has, such as its “sandbox.”
Firefox, like many other applications, lives in a sandbox, a colorful computer term that evokes a child safely playing in his sandbox. Firefox’s sandbox isolates the browser from the rest of the computer so that it can’t arbitrarily execute code to steal passwords, log into bank accounts, or do anything else it’s not strictly permitted to do. Even if Mythos could discover a way to execute its attack code, that code has to get out of the sandbox somehow, a much harder problem that Mythos didn’t solve.
In their tests, Anthropic disabled the sandbox and other “defense-in-depth” mitigations. Thus, Anthropic’s claim is much weaker than reported. If the Mythos-discovered Firefox exploits could have escaped the sandbox and defeated the other defenses, then we should worry. But a full exploit of a vulnerability requires those other very hard problems to have been solved as well. Claiming you figured out how to shoot somebody doesn’t mean much if, in your testing, you first took off the bulletproof vest the victim normally wears.
In its tests, Anthropic ran 250 cases. It found that in 72.4% of the cases, Mythos could identify a vulnerability in Firefox and then build a proof-of-concept method to exploit it. The media reports of 181 vulnerabilities discovered by Mythos come from these numbers: 72.4% of 250 is 181.
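The headline number is just arithmetic on the two figures above; a quick sketch using only the rates reported in the text:

```python
# Reproducing the widely reported "181 vulnerabilities" figure from the
# published numbers: a 72.4% success rate over 250 test cases.
total_cases = 250
success_rate = 0.724
successes = round(success_rate * total_cases)
print(successes)  # → 181
```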
However, a chart in the system card shows that Mythos’s skill depended on exploiting the same two bugs over and over again. If you remove those two bugs, the results are much less impressive: the success rate collapses from 72.4% to 4.4%, worse than Claude Opus.
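Assuming both rates are measured over the same 250 test cases (the system card’s exact denominators are not given here), the dependence on those two bugs can be quantified directly:

```python
# How much of Mythos's success rides on just two bugs, inferred from the
# reported success rates: 72.4% overall vs. 4.4% with the two bugs removed.
total_cases = 250
overall = round(0.724 * total_cases)           # 181 successful exploits
without_two_bugs = round(0.044 * total_cases)  # 11 successful exploits
tied_to_two_bugs = overall - without_two_bugs
print(tied_to_two_bugs)  # → 170 of the 181 successes depend on the two bugs
```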
Thus, Mythos’s success is highly dependent on the presence of two particular bugs, raising all kinds of questions. Can Mythos’s success rate be generalized to cases in which the bugs are not so highly concentrated? Should exploits that depend on the same two bugs be counted independently, or are they being double- and triple-counted? And is Mythos’s success relative to Sonnet and Opus more about the model harness and less about differences in the underlying cognitive capabilities of the models? The harness around the model is the non-AI code that manages it, gives it tools to work with, supplies it context, and performs other vital functions. If it’s the harness that’s making the difference, then you might be able to drop a much weaker AI model into the harness and find the same results.
There is also tremendous selection bias in these tests. Mythos didn’t scour code independently looking for vulnerabilities. Anthropic pointed Mythos at Firefox code in which they knew there were already these two vulnerabilities, bugs that had already been discovered by Claude Opus and had already been fixed by the Firefox team. How would Mythos perform on arbitrary Firefox code in which no one knows whether vulnerabilities exist? We don’t know.
What’s the False Positive Rate?
Evaluating a model’s ability to find bugs in computer code is similar to the problem of testing for rare diseases. Because of the mathematical fact known as Bayes’ theorem, testing for rare diseases (or finding rare bugs) usually means that most of the positives are false positives.
Here’s an example. Let’s suppose the probability of finding an exploitable bug in a section of computer code is 0.1%, i.e., it happens once in a thousand times. And let’s also assume the AI model is perfect: if there is a bug in the code, the AI model will always find it. But we also need to specify the chance that the model will say a bug exists when no bug is actually there, the false positive rate. Let’s suppose that’s only 5%. What is the probability that there is a genuine, exploitable bug in a section of code, given that the AI model has claimed there is? You might be surprised to learn it’s only 1.96%. Almost all of the bugs the AI model found aren’t really bugs.
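The 1.96% figure follows directly from Bayes’ theorem; here is a minimal sketch using only the numbers assumed above:

```python
# Posterior probability that a flagged bug is real, via Bayes' theorem.
prior = 0.001        # P(bug): one code section in a thousand has a bug
sensitivity = 1.0    # P(flagged | bug): the model never misses a real bug
fpr = 0.05           # P(flagged | no bug): 5% false positive rate

p_flagged = sensitivity * prior + fpr * (1 - prior)
posterior = sensitivity * prior / p_flagged
print(f"{posterior:.2%}")  # → 1.96%
```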
That result surprises people who aren’t familiar with Bayes’ theorem. The false positive problem arises because the underlying prevalence of bugs in the code is low. The 5% false positive rate gets magnified when the problem you are looking for is rare.
The number of false positives the model produces relative to the number of real bugs is critical to understand. If there are 50 false positives for every true vulnerability discovered, is the model really cost-efficient? Would human cyber-risk managers need to chase down 50 false leads for every true lead? How much would that cost? Without that crucial information, we can’t tell how useful and cost-effective a Mythos-like system is likely to be. Anthropic doesn’t give us the false positive data.
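Scaling the same assumed numbers (0.1% prior, perfect sensitivity, 5% false positive rate) to a hypothetical codebase makes the triage burden concrete; the 100,000-section figure is purely illustrative:

```python
# Triage burden implied by the assumptions above, over 100,000 code sections.
sections = 100_000
real_bugs = int(sections * 0.001)             # 100 genuine vulnerabilities
false_alarms = (sections - real_bugs) * 0.05  # 4,995 spurious flags
leads_per_true = round(false_alarms / real_bugs)
print(leads_per_true)  # → 50 false leads chased per true lead
```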
Vulnerabilities are in open source code, not the more difficult closed source or binaries
All the reported exploits were in open source software, where Mythos can see the source code. Finding vulnerabilities in open source code is much, much easier than in closed source or binary code. Computers actually run binary machine language; source code itself can’t be run and must be compiled into binary code, a series of ones and zeros, to execute on physical hardware. That means that if any target of a Mythos-identified hack has modified the source code of an open source application in the vulnerable area, the Mythos attacks may lose their force. And if the target is running closed source code, where an attacker can see only the binary version, the Mythos attacks are irrelevant.
Does Mythos have to be so smart to find bugs?
The assumption behind the alleged Mythos threat is that AI models have crossed a cognitive threshold that allows them to find and exploit bugs that no human being could realistically uncover. In the blog post accompanying the Mythos announcement, Anthropic claimed that Mythos found some signature exploits: a 27-year-old bug in OpenBSD, a widely used operating system, plus an additional OpenBSD bug, and a 16-year-old bug in ffmpeg, a widely used open source media library. The implication is that bugs that persisted undetected for that long would never have been found by humans. Only a highly intelligent model could do it.
However, Aisle Security, an AI software company that develops the same kind of software as Mythos, easily reproduced Mythos’s results using much dumber and much cheaper models. Mythos has been reported to have trillions of parameters, but Aisle Security was able to find the same OpenBSD vulnerabilities using a much, much weaker open source model with 5.1 billion parameters. As they pointed out, “small open models outperformed most frontier models.”
The Dismal Economics of Open Source Bug Detection
Finding bugs, then, is not about super-intelligence. Why do these bugs go undiscovered for so long? It’s all about economic incentives: nobody is looking for them.
Open source software is created by volunteers who, in most cases, are not paid. They work on the software because they are interested, because it’s challenging, because they want to learn new skills, because they want to strengthen their resumes, and for numerous other reasons. To the extent they look for bugs and vulnerabilities, they focus on what they think is important and interesting to them. No one is paying them to find bugs.
Why, then, don’t companies or other interested parties independently find vulnerabilities in the open source software they use? If the bugs are important enough to them, they might. But most bug finding in open source software suffers from the free rider problem in economics. In economic jargon, bug detection is a public good, like national defense. The cost of finding a bug is borne entirely by the developer or company, but the benefits accrue to everyone else, who don’t pay for them. Public goods are underproduced because of this free rider problem. Governments solve the free rider problem for defense and other public goods by forcing everyone who benefits to pay through compulsory taxation. Companies and developers, however, can’t force people to pay for the benefits of bug detection. So bugs can persist for decades, not because they are too hard to find, but because no one is really looking.
It’s not clear what Mythos brings that would change these economic incentives. We don’t even know if paying Mythos’s costs is a cheaper strategy than paying humans to find the bugs. Aisle Security’s results suggest smaller and cheaper models may be just as effective for finding vulnerabilities if we go the automated, AI route.
Do AI Models Pose No Cybersecurity Threats Then?
They definitely do. The AI productivity revolution in software development is real; AI models have made me five to ten times as productive at coding tasks. AI models will no doubt be very useful in finding and exploiting software vulnerabilities. Mythos can probably help a lot too, but it’s not so disproportionately effective that it must be put under lock and key. That’s just marketing hype that is not at all backed up by its own documentation. Mythos didn’t evolve into a super-intelligent being.
The AI coding tools market is highly competitive. Companies should certainly evaluate Mythos to help with software security, but they should look at the many other vendors as well. Competitors may be more effective, and cheaper too. No one at this point has a clearly dominant AI tool to protect against hacking.