Anthropic Expands Public Access to Claude Mythos AI Model

Source: Hacker News AI. Author: divija_07

Expect to See Widespread Availability of Mythos-Level Models Within 6-12 Months

Mathew J. Schwartz (euroinfosec) • May 26, 2026

More companies will get a shot at accessing Claude Mythos, the bug-hunting artificial intelligence model Anthropic has touted as too dangerous for general release.

Anthropic has restricted access to its Mythos large language model through its Project Glasswing, which counts about 50 carefully curated partners, including technology giants Cisco, Oracle and Microsoft (see: Anthropic Calls Its New Model Too Dangerous to Release).

"We will work with critical partners - including U.S. and allied governments - to expand Project Glasswing to additional partners. And in the near future, once we've developed the far stronger safeguards we need, we look forward to making Mythos-class models available through a general release," the company said Friday.

Mythos is notable because the LLM not only finds vulnerabilities in software at an unprecedented level, but can also deliver working exploit chains, sometimes combining low-severity flaws into a high-severity threat. These capabilities led Anthropic to restrict access, to give some of the world's biggest software vendors and open-source projects a head start at finding and patching flaws before threat actors gain the ability to do likewise, including through other emerging LLMs (see: Zero Days for the Masses: Mythos Presages Exploit Tsunami).

The company said it's just a matter of time before attackers have equivalent tools. "We believe that Mythos-level models will become widely available in the next 6-12 months," Anthropic said on Friday.

Few Mythos-discovered flaws have been fully detailed in public, and even then, typically only as a credit in new CVE release notes. "Disclosed vulnerabilities are a lagging indicator of the accelerating frontier of AI models' cyber capabilities: we're not yet at the point where we can fully detail our partners' findings with Mythos Preview without putting end users at risk," Anthropic said.

What Anthropic did say is that partner organizations have collectively found "more than 10,000 high- or critical-severity vulnerabilities across the most systemically important software in the world." Of these flaws, 6,202 existed in open-source software. Anthropic said a third party reviewed 1,752 of the high- or critical-rated vulnerabilities, validated that 91% of them were flaws and found that the correct severity rating was applied to two-thirds of them.

The real-world threat currently posed by Mythos remains unclear. Britain's AI Security Institute last month reported seeing rapid advances in frontier model performance, including with Mythos. "In controlled evaluations where Mythos Preview was explicitly directed and given network access to do so, we observed that it could execute multi-stage attacks on vulnerable networks and discover and exploit vulnerabilities autonomously - tasks that would take human professionals days of work," the institute found.

But the institute didn't simulate a typical enterprise environment with extensive cybersecurity defenses or active defenders. "This means we cannot say for sure whether Mythos Preview would be able to attack well-defended systems," the institute reported.

New Benchmarks

The direction of travel seems clear: frontier LLMs' ability to exploit vulnerabilities as well as orchestrate attacks is rapidly improving. New benchmarks for measuring the ability of LLMs to exploit vulnerabilities are being developed by academics.
These include ExploitBench, from Carnegie Mellon University and Bugcrowd, and ExploitGym, developed by UC Berkeley, the Max Planck Institute for Security and Privacy, University of California-Santa Barbara and Arizona State University, with input from security researchers at Anthropic, Google and OpenAI.

Glasswing participant Cloudflare said one of its big Mythos takeaways so far is that treating the model not as a "chat interface," but rather as a vulnerability discovery harness, led to much better results, as did working in more bite-sized pieces. The company's approach is to issue very narrow instructions, to use a second agent to check the work of the first and to ask many different agents separate questions in parallel, each pertaining to a different part of the attack chain.

"Coverage improves when many agents work on tightly scoped questions and we deduplicate the results afterward, rather than asking one agent to be exhaustive," said Cloudflare CSO Grant Bourzikas in a May 18 blog post.

Anthropic said the new tools it's offering to vetted security teams will include these types of "custom instructions for repeat work" - developed with Project Glasswing participants - as well as "a harness that helps Claude map the codebase, spin up scanning subagents, triage its findings and write reports." Also included is a "threat model builder" that scans a code base to identify likely attack targets, to help researchers prioritize their efforts.

Cloudflare said one upside of working with Mythos has been the "clear improvement" in the LLM's ability to chain vulnerabilities into a working proof-of-concept exploit, which helps reduce the time required to remediate the flaw. "A finding that arrives with a PoC is a finding you can act on, and it means far less time spent asking 'is this even real?'" Bourzikas said.

Big questions loom.
Code maintainers already report seeing a deluge of bug reports, including extensive duplication thanks to AI-assisted vulnerability researchers using the same LLMs.

Microsoft-owned GitHub earlier this month tweaked its bug bounty program in response to "a sharp increase in submissions that don't demonstrate real security impact" and that too often fail to include a proof-of-concept exploit. "This isn't unique to GitHub. Programs across the industry are grappling with the same challenge, and some have shut down entirely," it said. The firm said it is also seeing an increase in "legitimate reports," for which it still offers cash rewards. But for "submissions that don't demonstrate significant security impact but do result in a code or documentation fix," the reward will now be only "GitHub swag."

Another challenge is speed: As more vulnerabilities get found and reported, and software developers issue more patches, end-user organizations must test and roll out these fixes. Vulnerability management experts said theoretical limits exist on how quickly this can be done, at least effectively.

Given such challenges, Cloudflare said "what the architecture around the vulnerability should look like" remains an unanswered question. "The principle is to make exploitation harder for an attacker even when a bug exists, so that the gap between when a vulnerability is disclosed and when it is patched matters less," Bourzikas said. What that looks like in practice remains under development.
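
For readers who want a concrete picture of the workflow Cloudflare describes (narrow questions, many agents in parallel, a second agent checking the first, deduplication afterward), here is a minimal Python sketch. It is not Cloudflare's or Anthropic's code: the names `Finding`, `run_harness`, `stub_finder` and `stub_checker` are hypothetical, the two stub functions stand in for real model calls, and the choice to deduplicate on file, line and bug kind is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    kind: str    # e.g. "buffer-overflow"
    detail: str  # free-text note from the agent


# Hypothetical agent roles, modeled as plain callables.
Finder = Callable[[str], list[Finding]]  # narrow question -> candidate findings
Checker = Callable[[Finding], bool]      # second agent: does the finding hold up?


def run_harness(questions: Iterable[str], finder: Finder, checker: Checker,
                max_workers: int = 8) -> list[Finding]:
    """Fan out narrow questions, verify each finding, then deduplicate."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One finder call per tightly scoped question, run in parallel.
        batches = list(pool.map(finder, questions))
        candidates = [f for batch in batches for f in batch]
        # A second agent checks the work of the first.
        verdicts = list(pool.map(checker, candidates))

    confirmed = [f for f, ok in zip(candidates, verdicts) if ok]

    # Deduplicate afterward (assumption: same file, line and kind = same bug).
    unique: dict[tuple[str, int, str], Finding] = {}
    for f in confirmed:
        unique.setdefault((f.file, f.line, f.kind), f)
    return list(unique.values())


# Stub agents standing in for real model calls (illustrative only).
def stub_finder(question: str) -> list[Finding]:
    if "parser" in question:
        return [Finding("parser.c", 42, "buffer-overflow", f"from: {question}")]
    return [Finding("auth.c", 7, "style-nit", f"from: {question}")]


def stub_checker(finding: Finding) -> bool:
    return finding.kind != "style-nit"


if __name__ == "__main__":
    qs = [
        "Does the parser bound-check length fields?",
        "Can parser input overflow the header buffer?",
        "Is the auth token compared in constant time?",
    ]
    for f in run_harness(qs, stub_finder, stub_checker):
        print(f.file, f.line, f.kind)
```

With these stubs, two questions surface the same parser bug and one produces a finding the checker rejects, so the harness returns a single confirmed finding. In a real setup the finder and checker would be separate model invocations with their own narrow instructions.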