Fable 5 and Mythos 5 remain suspended: “The ball is in Anthropic’s court”
Anthropic calls the jailbreak behind the suspension a minor vulnerability, but the involvement of Amazon and of David Sacks, the White House’s former AI czar, has escalated the situation and sets a precedent for AI safety regulation.
On Friday evening, Anthropic suddenly disabled its new flagship models, Fable 5 and Mythos 5, after the U.S. government became aware of a way to perform a specific jailbreak on Fable 5 and put it under an export control order. Since this order applies to all foreign nationals, including those in the U.S., Anthropic had no choice but to disable these models for everybody.
It remains unclear what this jailbreak entailed. Anthropic argues that what the government demonstrated were “minor vulnerabilities” that “all appear relatively simple” and that don’t go beyond the capabilities of other publicly available models.
When Anthropic announced Fable 5 and Mythos 5, it noted that Fable 5 had undergone extensive red-teaming security exercises with the help of the UK’s AI Security Institute and other external testers. Anthropic’s own internal testing showed that the model would complete about 5% of adversarial cyber tasks.
The Fable 5 model card also specifically notes that, “in the event that a public universal jailbreak is found, we will move quickly to update our defenses to ensure that they remain robust to all known attacks.” But according to the information available so far, this issue isn’t a universal jailbreak; it concerns a very specific problem.
As of Saturday morning, Anthropic hasn’t updated its previous statement, which concluded that all of this “is a misunderstanding.”
More than a misunderstanding?
Given that this is 2026, the story gets more complicated, though. David Sacks, the co-chair of the President’s Council of Advisors on Science and Technology and the White House’s former AI and crypto czar, on Saturday tweeted the U.S. government’s version of events.
Sacks argues that “a highly credible trusted partner of both Anthropic and [the U.S. government]” reported the jailbreak and that the administration asked Anthropic CEO Dario Amodei to improve the guardrails to fix the jailbreak or take the model down. “Dario refused,” Sacks writes.
Amazon’s role
According to independent reports from the Wall Street Journal and The Information, it was Amazon CEO Andy Jassy who reported the jailbreak, which Amazon researchers had found. The Wall Street Journal writes that he reported it to “U.S. officials, including Treasury Secretary Scott Bessent.”
Those Amazon researchers, the report says, found ways to get Fable 5, the version of Mythos 5 with security guardrails, to aid in cyberattacks. Anthropic, when it released Fable 5, noted that it had put guardrails in place to prevent Fable 5 from aiding users in starting cyberattacks or creating bioweapons, for example.
Indeed, many users quickly complained that the model refused to answer innocuous questions. Often, when the system detected potentially unsafe prompts, Claude would also quietly switch to the former flagship model, Opus 4.8.
Since this jailbreak was reported by Amazon, chances are those researchers tested Fable 5 on Amazon Bedrock, which Amazon says has the same safety mechanisms in place as using Claude through Anthropic directly.
Sacks argues that Anthropic defended its decision not to take the model down “by saying the jailbreak isn’t serious,” and he makes a rhetorical move that backs Anthropic into a corner of its own making.
“That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company,” he writes. “It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not ‘serious.’”
As many a pundit has pointed out since this story broke, it was Anthropic that argued that Mythos 5 was too dangerous to release to the public. It’s also Anthropic that has built a brand on being the frontier lab that takes AI safety seriously.
Sacks is now turning this against the company. “In the past, Anthropic has always said that safety must be top priority and taken super seriously,” he writes. “In this case, Anthropic prioritized the continued offering of the consumer model over safety.”
What’s next?
The most obvious solution here is for Anthropic to put new guardrails in place that would make this specific jailbreak impossible — though given the nature of these non-deterministic models, some other jailbreak may just be around the corner.
Chances are, though, that we’ll see a fix relatively soon, that the export control order will be lifted and that the model will become available again.
This does, however, set a new precedent for how the U.S. government could handle AI safety, and the other U.S.-based frontier labs are surely watching very closely. AI progress has been a constant back-and-forth between these labs, after all, with one besting the other on a regular basis, and Fable 5/Mythos 5 isn’t likely to be the pinnacle of AI model development.
What this means for the next tranche of models from OpenAI and Google remains to be seen. The U.S. government has, after all, proposed voluntary safety tests before new models are released, and this affair will likely bring that idea to the forefront again.
Anthropic, it is worth noting, is the company that has advocated for AI regulation more than any other.
“The ball is in Anthropic’s court”
All of this, of course, is complicated by the contentious relationship between Anthropic and the Trump White House.
Sacks, in his tweets, argues that this is not a factor here and that “the Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.”
The post Fable 5 and Mythos 5 remain suspended: “The ball is in Anthropic’s court” appeared first on The New Stack.