AI News HubLIVE
In-site rewrite2 min read

Show HN: AI-whisper – Claude works better when Codex watches its back

AI-whisper is an open-source CLI tool that pairs two coding agents (e.g., Claude, Codex, ezio) in a single terminal session, using structured workflows and a single-baton relay to coordinate collaboration. It supports spec-driven development, bug fixing, and more, with an LLM evaluator gating each round.

SourceHacker News AIAuthor: vuphanse

← projects

v0.7.0 — open source, on npm, actively used personally.

what it does

ai-whisper pairs two coding agents in your terminal — mount any two of Claude, Codex, and ezio, provider-agnostic by design. They share a single baton — one owns the turn at a time — and structured workflows drive the collaboration, with an evaluator gating each round. The relay isn’t tied to any one arrangement; today’s workflows pair an implementer and a reviewer (spec-driven-development, ralph-loop, complex-bug-fixing, deliberation).

why

Hard agent tasks go better with a second agent reviewing. Whisper makes that pairing feel like one workflow instead of two terminals pretending to talk to each other.

ai-whisper was built this way: nearly every feature started as a spec run through its own spec-driven-development workflow.

features

structured workflows — spec-driven-development, ralph-loop, complex-bug-fixing, deliberation

single-baton relay — one owner at a time, not a swarm

role-agnostic — workflows assign the roles (today: implementer + reviewer)

autonomous loops gated by an llm evaluator

pausable, resumable runs — pause a healthy run mid-flight, recover a stuck one, don't restart

real mounted claude / codex / ezio sessions as the source of truth

workflows

A workflow is a structured loop — phases, gates, round budgets — not a long prompt. Pick by the shape of the work:

spec-driven-development — when you can describe “done” up front. Sharpens a spec into a plan, then executes it under review.

ralph-loop — when you can’t. Grinds an open-ended goal chunk-by-chunk, a reviewer gating each one.

complex-bug-fixing — when a bug is reported but the root cause isn’t. Three phases (diagnosis → fix-and-verify → post-mortem), with the implementer required to reproduce the bug — a committed failing test, not speculation from reading code.

deliberation — when the idea is still fuzzy and you can’t describe “done” yet. An Explorer and Challenger map the space (objectives → approaches → tradeoffs → synthesis), then hand back a committed findings doc to take into brainstorming or SDD.

Whichever you pick, you write the spec, goal, or bug report; the run converges on the criteria you set — so a precise artifact is most of the outcome.

Autonomous covers the loop, not the ship decision. A finished run is a strong draft: you still read the diff, run it, and QA it before it lands. ai-whisper does the convergence and hands you the full trail to judge fast.

see it run

A real spec-driven-development run: Claude (left) and Codex (middle) work in their own mounted sessions while the dashboard (right) tracks the baton handoffs and per-phase verdicts.

install

npm install -g ai-whisper