2026-06-23 12:30 UTCIn-site rewrite5 min readUpdated: 2026-06-23 14:06 UTC

Sakana Fugu: Multi-Agent System as a Model

Sakana AI's Fugu packages multi-agent orchestration into a single model API, hiding the complexity of coordinating specialized agents behind a standard LLM interface. Developers can trigger delegation, verification, and synthesis with one API call, simplifying production AI workflows.

SourceAnalytics VidhyaAuthor: Harsh Mishra

Article intelligence

EngineersAdvanced

Key points

Fugu operates as a multi-agent system that looks like a single model from the outside. It handles agent selection, role assignment, coordination, and verification internally.
Two variants are available: Fugu for everyday tasks balancing quality and latency, and Fugu Ultra for high-stakes reasoning and research.
Pricing avoids stacking fees across multiple agents; pay-as-you-go and subscription plans are offered.
Benchmarks show strong performance on coding, reasoning, and agentic tasks, but Ultra is not always superior; developers should test on their own workloads.

Why it matters

This matters because fugu operates as a multi-agent system that looks like a single model from the outside. It handles agent selection, role assignment, coordination, and verification internally.

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.

This panel is AI-generated and reviewed for accuracy.

-->

Sakana Fugu: Multi-Agent AI Orchestration in a Single Model

India's Most Futuristic AI Conference Is Back – Bigger, Sharper, Bolder

Career

GenAI

Prompt Engg

ChatGPT

LLM

Langchain

RAG

AI Agents

Machine Learning

Deep Learning

GenAI Tools

LLMOps

Python

NLP

SQL

AIML Projects

Reading list

How to Become a Data Analyst in 2025: A Complete RoadMap

A Comprehensive Learning Path to Tableau in 2025

A Comprehensive NLP Learning Path 2025

Learning Path to Become a Data Scientist in 2025

Step-by-Step Roadmap to Become a Data Engineer in 2025

A Comprehensive MLOps Learning Path: 2025 Edition

Roadmap to Become an AI Engineer in 2025

A Comprehensive Learning Path to Master Computer Vision in 2025

Best Roadmap to Learn Generative AI in 2025

GenAI Roadmap for Enterprises

Large Language Models Demystified: A Beginner’s Roadmap

Learning Path to Become a Prompt Engineering Specialist

Sakana Fugu: Multi-Agent System as a Model

Harsh Mishra Last Updated : 23 Jun, 2026

8 min read

For years, AI progress has centered on scaling individual foundation models: larger parameters, longer context windows, stronger reasoning, and better tool use. Sakana AI’s Fugu points elsewhere, behaving like one model from the outside while coordinating multiple expert agents internally.

A single API call can trigger direct answering, specialist delegation, intermediate verification, and final synthesis, hiding orchestration complexity behind a normal LLM interface. In this article, a practical guide to Fugu’s architecture, variants, pricing, benchmarks, access, code, tests, enterprise fit, trade-offs, and use cases.

Table of contents

What is Sakana Fugu?

Why the naming matters

Why Multi-Agent System as a Model Matters

Sakana Fugu Release Overview

Fugu vs Fugu Ultra

Fugu

Fugu Ultra

Comparison table

Architecture: How Fugu Works Internally

Core architecture components

Pricing

Benchmark Results

Technical Hands-on: Using Sakana Fugu API

Conclusion

What is Sakana Fugu?

Sakana Fugu is an OpenAI-compatible managed model API that looks like a single LLM but works as a multi-agent system internally. Developers send a prompt to one model ID, such as fugu or fugu-ultra, while Fugu handles agent selection, role assignment, coordination, verification, and final response.

Instead of manually building planner, coder, reviewer, researcher, or supervisor agents with frameworks like LangGraph, AutoGen, or CrewAI, teams get orchestration packaged into the model itself. This reduces the need to manage prompts, routing, retries, memory, state, monitoring, and failure recovery.

Why the naming matters

The name “Sakana” means fish in Japanese. The company often frames its research around collective intelligence, similar to how a school of fish can behave as one coordinated system. Fugu follows that idea. Many agents coordinate behind one interface.

Why Multi-Agent System as a Model Matters

Most production AI systems today fall into one of three patterns:

Single-model prompting

Tool-augmented LLM applications

Manually designed multi-agent workflows

Single-model prompting is simple, but it can fail on complex tasks that require planning, execution, verification, and iteration.

Tool-augmented LLMs improve usefulness by connecting models to search, databases, code execution, APIs, or business systems. But the model still usually acts as the central reasoning engine.

Multi-agent workflows go further. They divide work across specialized agents. For example:

A planner breaks down the task.

A researcher gathers context.

A coder writes code.

A reviewer checks for correctness.

A verifier tests the answer.

A supervisor coordinates the process.

This can improve reliability on difficult tasks, but building it well is hard. Teams must answer many system design questions:

Which agent should handle which task?

How should agents communicate?

When should the system stop?

How should intermediate outputs be verified?

How should cost and latency be controlled?

How should failures be recovered?

How should compliance restrictions be applied?

Fugu attempts to make this easier by turning multi-agent orchestration into a model-level capability. The developer does not need to design every agent interaction manually.

Sakana Fugu Release Overview

Sakana Fugu was introduced as Sakana AI’s commercial multi-agent orchestration product. The initial beta positioned it as a system that coordinates pools of frontier foundation models for coding, mathematics, scientific reasoning, research, and complex analysis.

The latest Fugu release makes the product easier to access through Sakana’s console and an OpenAI-compatible API. The core release message is simple: developers can plug multi-agent intelligence into existing workflows without rewriting their application around a new SDK or orchestration framework.

Fugu vs Fugu Ultra

Sakana Fugu comes in two main model options: Fugu and Fugu Ultra.

Fugu

Fugu is the default model for everyday work. It balances performance and latency. It is suitable for coding support, code review, chatbots, internal assistants, document analysis, and interactive workflows where response time matters.

A key point is that Fugu can route to the best model based on the task. It also allows users to opt specific agents out of the model pool, which can help with data, privacy, compliance, or organizational requirements.

Fugu Ultra

Fugu Ultra is optimized for maximum answer quality. It coordinates a deeper pool of expert agents and is intended for hard, high-stakes, multi-step problems. According to the Sakana, Fugu Ultra can route between one to three agents depending on the problem.

Fugu Ultra is better suited for workloads where accuracy, depth, and persistence matter more than latency. Examples include:

Paper reproduction

Kaggle-style data science workflows

Cybersecurity analysis

Literature review

Patent investigation

Deep technical research

Complex code review

Scientific reasoning

Comparison table

Feature Fugu Fugu Ultra

Best for Everyday coding, chat, review, interactive workflows Hard reasoning, research, high-stakes analysis

Design goal Balance quality and latency Maximize quality

Agent pool Flexible, with opt-out support Fixed full pool

Latency Lower Higher

Cost Depends on active underlying agent tier Fixed token pricing

Recommended users Developers, product teams, internal tools Researchers, advanced developers, enterprise analysis teams

Main trade-off Less depth than Ultra Higher cost and response time

Architecture: How Fugu Works Internally

Fugu’s architecture can be understood as a managed orchestration layer wrapped inside a model API.

From the outside, the flow looks like this:

Source: Sakana.ai

Internally, the system is closer to this:

Source: Sakana.ai

Sakana Fugu exposes a single API while internally coordinating a pool of specialized models. The user sends one request, and Fugu handles routing, delegation, verification, and synthesis.

Core architecture components

API gateway

The developer interacts with a standard API surface. This matters because Fugu supports OpenAI-compatible endpoints, so teams can reuse existing OpenAI SDK clients with a different base URL and API key.

Orchestrator model

The orchestrator is the core intelligence layer. It decides how the task should be handled. For simpler tasks, it may answer with minimal orchestration. For complex tasks, it can coordinate multiple expert agents.

Agent pool

Fugu has access to a pool of underlying models or agents. These agents may have different strengths across coding, reasoning, research, long-context analysis, or other specialized tasks.

Dynamic routing

Instead of hardcoding a workflow, Fugu dynamically selects which agent or agents to use. This is important because model strengths are often task-specific. One model may perform better at code generation, another at mathematical reasoning, another at long-context synthesis.

Delegation and communication

The orchestrator can break down a complex task into subtasks. It can send focused instructions to different agents and control what context each agent receives.

Verification

For difficult tasks, the system can use verification-style behavior. One agent may solve, another may critique or validate, and the orchestrator may combine the results.

Synthesis

The final answer is returned as a single response. The user does not see the full internal agent graph. .

Pricing

Fugu has two pricing modes: pay-as-you-go and subscription plans.

Pay-as-you-go

Pay-as-you-go is designed for heavier production workloads. Sakana says consumption-based tokens are served at higher priority than monthly-plan tokens.

Fugu pricing

Fugu pricing depends on the active agent setup.

Active agents Billing rule

1 agent Pay the standard rate for the specific underlying model

Multiple agents Fees are not stacked. You are charged one rate based on the top-tier model involved

This is important because many multi-agent systems become expensive when each model call is billed separately. Fugu’s pricing model tries to avoid stacking model fees across agents.

Fugu Ultra pricing

Fugu Ultra has fixed pricing for fugu-ultra-20260615 per 1M tokens.

Token type Standard price Context greater than 272K

Input $5 per 1M tokens $10 per 1M tokens

Output $30 per 1M tokens $45 per 1M tokens

Cached input $0.50 per 1M tokens $1.00 per 1M tokens

Subscription plans

Subscription plans are designed for individuals and everyday hands-on use. Every tier includes both Fugu and Fugu Ultra.

Plan Price Best for Usage

Standard $20/month Lightweight daily usage, occasional API calls, small experiments Baseline allowance

Pro $100/month Regular coding, review, research, and analysis sessions 10x Standard usage

Max $200/month Heavy long-running workloads 20x Standard usage

Benchmark Results

Sakana reports Fugu and Fugu Ultra benchmark scores across coding, reasoning, science, agentic tasks, long-context reasoning, and cybersecurity-style evaluation.

Source: Sakana.ai

Sakana Fugu and Fugu Ultra compared with frontier baseline models across coding, reasoning, science, long-context, and agentic benchmarks.

Benchmarks are useful, but they should not be treated as direct production guarantees. Fugu’s benchmark profile suggests three practical insights.

Fugu is strongest when tasks require orchestration

The strongest use case is not a simple one-shot answer. The model is designed for tasks that benefit from decomposition, expert selection, verification, and synthesis.

Examples:

Debug this repository.

Review this pull request.

Reproduce this research paper.

Investigate this patent landscape.

Analyze a possible security vulnerability.

Compare multiple technical approaches and recommend one.

Ultra is not always automatically better

Fugu Ultra is optimized for answer quality, but Fugu can outperform it on some benchmarks. Developers should benchmark both models on their own workload before standardizing.

A practical routing strategy could be:

Use fugu for interactive work. Use fugu-ultra for complex, high-value tasks. Fallback to fugu when latency or cost matters.

Multi-agent performance comes with hidden complexity

Even though Fugu hides orchestration complexity from the developer, the underlying system still performs additional work. This can affect latency, cost, and observability.

Teams should monitor:

Total tokens

Orchestration tokens

Latency by task type

Quality by workload category

Failure cases

Model version behavior

Cost per successful outcome

Technical Hands-on: Using Sakana Fugu API

Sakana fugu documentation: https://console.sakana.ai/get-started

1: Create an API key

Go to the Sakana console API key page login and create API: https://console.sakana.ai/api-keys

Create an API key and store it securely. The key is shown o

[truncated for AI cost control]