A New Competitor for Fable 5 and Mythos Preview: Sakana's Fugu Ultra Model
Sakana AI releases Fugu Ultra, a multi-agent orchestration model that matches frontier performance of Anthropic's Fable 5 and Mythos Preview on engineering, science, and reasoning benchmarks, while avoiding single-vendor dependency and export control risks. Fugu is a language model that dynamically orchestrates a pool of models, including itself, to handle complex multi-step tasks via a single API. Two versions are available: Fugu for balanced performance and low latency, and Fugu Ultra for maximum accuracy. Early users report strong results in code review, security analysis, and automated research.
-->
Sakana AI Releases ‘Fugu Ultra’ to Match Frontier Performance via Autonomous Model Orchestration. Our Fugu Ultra model stands shoulder-to-shoulder with leading models like Anthropic’s Fable 5 and Mythos Preview across the industry’s most rigorous engineering, scientific, and reasoning benchmarks while delivering frontier capability without the risk of export controls.
(*日本語は英文の後に)
We are excited to introduce Sakana Fugu, a new product from Sakana AI that delivers a full multi-agent orchestration system as a single foundation model. Fugu dynamically orchestrates the world’s best models to tackle complex, multi-step tasks, accessible through a single model API. The result is multi-agent intelligence delivering the very best frontier-level performance without any single-vendor dependency or the complexity of a traditional multi-agent system.
👉 Sakana Fugu
Sakana Fugu is itself a language model trained to call various LLMs in an agent pool, including instances of itself recursively. Fugu dynamically orchestrates the world’s best models to tackle complex, multi-step tasks. Plug collective intelligence directly into your workflows today with a single API.
Beyond Bigger Models: Orchestration Models are the Next Frontier
For the past few years, progress in AI has been driven largely by brute-force scale: building giant, monolithic models trained on ever-larger amounts of data. But hard, real-world tasks require a multitude of specialized knowledge and skills, far beyond any individual benchmark. Unlocking the very best performance therefore requires collective intelligence: knowing which model to use, delegating tasks such as planning and execution, and combining domain-specific strengths while routing around individual weaknesses.
Since our founding, Sakana AI has been guided by a core conviction: the most powerful AI systems will not be isolated monoliths, but collaborative ecosystems. Evolution innovates under constraints, and the future belongs to systems that explicitly learn how to coordinate collective intelligence.
Today, this orchestration is no longer just a technical optimization; it has become a geopolitical and operational imperative. Recent disruptions in the AI landscape have demonstrated the severe risk of single-vendor dependency. For an organization or a nation, relying on a single company’s APIs for critical infrastructure, finance, or governance is a material vulnerability. This risk is no longer a hypothetical possibility, but a reality. As we have seen recently from export controls imposed on Anthropic’s Fable and Mythos models, access can shift or disappear overnight due to changing regulatory boundaries, export controls, and foreign policies.
Collective intelligence serves as the practical hedge against this concentration of power. Sakana Fugu is powered by models trained to be powerful orchestrators with an underlying pool of entirely swappable agents. If a single provider restricts access, Fugu dynamically routes around the disruption. Over time, Sakana Fugu will naturally grow by incorporating newer, more efficient models, including our own. By orchestrating the world’s models, we are delivering the realistic, resilient blueprint required for AI sovereignty.
What Is Sakana Fugu?
Sakana Fugu is a multi-agent system that behaves like a single model. You send a request to one endpoint, and Fugu decides how to handle it: solving it directly when that is enough, or assembling and coordinating a team of expert models when a task calls for more. It manages model selection, delegation, verification, and synthesis internally, so the complexity of a multi-agent system never reaches your code.
What makes this possible at scale is that Fugu is itself a language model specialized to understand when to delegate, how agents should communicate, and how to combine their work into a single, reliable answer. This approach builds on our research on learned model orchestration, including our recent ICLR 2026 papers Trinity and the Conductor. From the outside, you simply call one model. On the inside, a coordinated system of experts is doing the work.
Fugu and Fugu Ultra
At launch, Sakana Fugu comes in two models, so you can match the system to your workload. Both models can be accessed via a single OpenAI-compatible API.
Fugu balances strong performance with low latency, making it a great default for everyday work. It fits naturally into tools like Codex for coding and code review, as well as chatbots and other interactive services. For teams with data, privacy, or compliance requirements, Fugu also lets you opt specific agents out of its pool.
Fugu Ultra is tuned for maximum answer quality on hard, multi-step problems, coordinating a deeper pool of expert agents when accuracy and depth matter most. Early users have relied on it for demanding work such as AI research, paper reproduction, cybersecurity analysis, and literature and patent investigations.
Here is how the two models perform across standard benchmarks:
Our Fugu Ultra model stands shoulder-to-shoulder with leading models like Fable 5 and Mythos Preview across the industry’s most rigorous engineering, scientific, and reasoning benchmarks. It delivers frontier capability without the risk of export controls.
Performance comparison of Fugu models and baseline frontier models across a suite of coding, reasoning, scientific, and agentic benchmarks. All scores other than Fugu’s are reported by the model providers. For Fable 5 and Mythos Preview, we report the max of the two if both scores are available on the same benchmark. Neither of them is in Fugu’s agent pool as they are not publicly accessible. For more details, please refer to our technical report.
Benchmark results comparing Fugu with underlying foundation models used by Fugu, where highest scores are in boldface and the second highest are underlined:
*We use the mini-swe-agent as the scaffolding for this task.
†We use model provider-reported scores for the baselines.
What Early Users Are Building
Benchmarks tell only part of the story. Fugu’s value shows up most clearly in long, messy, real-world workflows, which is exactly what we focused on during our beta program with close to 500 early users, whose feedback helped us improve the system.
Applications of Fugu Models. In our experiments, we find that Fugu Models consistently outperform frontier models Gemini 3.1 Pro (high), Opus 4.8 (max), and GPT 5.5 (xhigh) for various applications, such as AutoResearch, Rubik’s Cube, Mechanical Design, Japanese Handwriting Analysis, One-Shot Chess, Financial Time Series Prediction.
One of the clearest signals came from automated data science research: early users running Sakana Fugu in an almost fully automated research mode saw it drive meaningful progress with little to no human intervention. For us, this is exactly the kind of task Fugu Ultra is designed for: open-ended, multi-step work where the system needs to explore ideas, run experiments, interpret failures, revise its approach, and keep making progress over time.
Here is what other users are saying:
“For code review, Fugu Ultra is significantly better than GPT-5.5. It gives comprehensive answers and finds the bugs others miss. Where other tools flag about three issues, Fugu surfaced more than twenty. It's become the model I run all my reviews through.”
— Software Engineer, on Coding and Code Review
“Raw output quality is on par with top frontier models, but Fugu showed unusually strong persona stability across long sessions, holding its identity where other models drift. For agent products, that may matter more than raw benchmark scores.
— Executive at Enterprise Platform Company, on Orchestration Quality
“Given one scoped instruction, Fugu drove a full security assessment end-to-end — recon, XSS/SQLi checks, auth review, and a clean report with evidence and retest steps — staying inside scope and avoiding destructive actions.”
— Cyber Security Engineer, on Security Assessment Analysis
We saw similar patterns across paper reproduction, cybersecurity analysis, code review, and literature and patent investigations. In these workflows, the value of Fugu is not just a better answer to one prompt, but sustained progress across many steps: reading, implementing, testing, comparing evidence, finding gaps, and producing a useful final analysis or report. The beta made clear that multi-agent orchestration matters most when the task is messy, long-running, and difficult to solve with a single model call.
Sakana Fugu is generally available today. You can access both Fugu and Fugu Ultra through a single API, with subscription tiers for everyday use and a pay-as-you-go plan for heavier and enterprise workloads. To get started, visit our product page or console site.
Looking Ahead
We are deeply grateful to our early users who put Fugu through real, demanding work and helped us shape what it is today. This launch is a starting point, not a finish line. Because Fugu is built on learned orchestration rather than fixed workflows, it improves as the underlying ecosystem improves: as new frontier models arrive, we can fold them into Fugu’s agent pool and pass the gains on to you. In the months ahead, we plan to expand the pool of expert agents, including open models and Sakana AI’s own models, to strengthen coordination for long-running and agentic tasks, and give users more control over how Fugu works on their behalf. We are excited to see what you build with it.
We are looking for people to help shape the future of AI together with Sakana AI. Please see our careers page.
Publications
Sakana Fugu Technical Report, Fugu Team, Sakana AI, 2026.
Xu, Sun, Schwendeman, Nielsen, Cetin, Tang. TRINITY: An Evolved LLM Coordinator. ICLR 2026.
https://arxiv.org/abs/2512.04695
Nielsen, Cetin, Schwendeman, Sun, Xu, Tang. Learning to Orchestrate Agents in Natural Language with the Conductor. ICLR 2026.
https://arxiv.org/abs/2512.04388
Japanese
Sakana Fugu:マルチエージェントシステムを、一つのモデルAPIとして提供
Sakana AI、自律的なモデルオーケストレーションでフロンティア性能に並ぶ「Fugu Ultra」を提供開始 Fugu Ultraは、エンジニアリング・科学・推論といった業界屈指の厳しいベンチマークにおいて、AnthropicのFable 5やMythos Previewといった最先端モデルに比肩します。しかも輸出規制のリスクを負うことなく、フロンティアレベルの能力を発揮します。
Sakana AIは、マルチエージェントのオーケストレーションシステムを一つの基盤モデルとして提供する新プロダクト「Sakana Fugu(サカナ・フグ)」の提供を開始します。Sakana Fuguは、最高性能のモデル群を動的にオーケストレーションして複雑で多段階のタスクに取り組むシステムであり、単一のモデルAPIから利用できます。これにより、一つのベンダーに依存することなく、また自身で複雑なそうしたシステムをつくることなく、フロンティアレベルの性能を備えたマルチエージェントの能力を利用できます。
👉 Sakana Fugu
Sakana Fugu自体が一つの言語モデルであり、エージェントプール内のさまざまなLLMを呼び出すように学習されている。そこでは自分自身を再帰的に呼び出すこともある。Sakana Fuguは、最高性能のモデル群を動的にオーケストレーションし、複雑で多段階のタスクに取り組むことで、その集合知を一つのAPIですぐにワークフローに組み込むことを可能にする。
スケーリングの先へ:次のフロンティアとしてのオーケストレーションモデル
この数年、AIの進歩は主にスケールの追求、すなわち巨大で一枚岩のモデルをますます大量のデータで学習させることによって牽引されてきました。しかし、現実世界の難しいタスクでは、単一のモデルを一度呼び出すだけで最良の結果が得られることはほとんどありません。どのモデルを使うか、いつ処理を委譲するか、途中の作業をどう検証するか、そして個々のモデルの弱点を避けつつ、それぞれの強みをどう組み合わせるか。AIの最先端の能力は、こうした複数モデルの集合知をいかに活用するかに関する判断の積み重ねによって引き出されます。
Sakana AIは創業以来、たった一つ大きなモデルではなく、複数のモデルが協調するエコシステムをつくることで最も強力なAIシステムが実現できるという考え方を大切にしてきました。生物進化が様々な制約のもとで新たな解を見つけてきたように、集合知をどう協調させるかを自ら学習するシステムがこれからは重要になると考えています。
こうしたオーケストレーションは、技術的に理にかなったアプローチであるだけではなく、いまや地政学的にも、実務面でも、避けて通れない技術になっています。近年のAIをめぐる動向は、単一ベンダーへの依存が抱える深刻なリスクを浮き彫りにしました。組織にとっても国家にとっても、重要インフラや金融、行政を一社のAPIに頼って動かすことは、現実的な弱点になり得ます。そしてこのリスクは、もはや仮定の話ではなくなっています。最近のAnthropicのFable 5およびMythos 5モデルに課された輸出規制に見られたように、規制の枠組みや輸出管理、各国の政策が変われば、アクセスの条件は一夜にして変わり得ます。
集合知によるアプローチは、このような特定のプレイヤーへの集中に対する、現実的な備えにもなります。Sakana Fuguはオーケストレーションのためのモデルとして学習させたものであり、その背後で用いるモデル群は、必要に応じて柔軟に入れ替え可能です。仮にあるプロバイダーが利用を制限しても、Sakana Fuguはその影響を動的に迂回します。今後は、より新しいモデルや、Sakana AI自身のモデル、その他のオープンモデルも、随時プールに加えたり、入れ替えたりしていく予定です。世界中のモデルをオーケストレーションすることで、AI主権(AI sovereignty)を支える、現実的で確かな選択肢を示していきたいと考えています。
Sakana Fuguとは
Sakana Fuguは、単一のモデルのように振る舞うマルチエージェントシステムです。ユーザーが一つのエンドポイントにリクエストを送ると、Sakana Fuguがその処理方法を判断します。単独モデルで十分な場合はそのまま解き、より高度な対応が求められる場合には専門モデルのチームを編成して連携させます。モデルの選択、委譲、検証、統合をすべて内部で管理するため、マルチエージェ
[truncated for AI cost control]