New version of "peers" – the AI couple doing things
peers is an open-source tool that drives two or more AI coding agents (Claude Code, Codex, etc.) as cooperating peers with hard gates: tests pass, coverage holds, no regression, no TODOs/stubs/skipped tests, secrets clean. One peer implements, the other blind-reviews, and an adversarial skeptic re-audits before acceptance. Runs unattended, budget-capped, and container-sandboxed.
Two AI coding agents are better than one — if you make them prove it.
peers drives n ≥ 2 AI coding CLIs (Claude Code, Codex, …) as cooperating peers that don't just agree a task is done — they have to clear hard, measurable gates first: tests pass, coverage holds, no regression, no TODO/stub/skipped-test, secrets clean. One peer implements, the other blind-reviews (without seeing the first's notes), and an adversarial skeptic re-audits before any "done" is accepted. Runs unattended, budget-capped, and container-sandboxed.
Why it beats a single agent on a loop:
Gated, not vibes-based. "Looks done" never converges — gates green + skeptic-clean does. No convergence theater.
Blind peer review catches rubber-stamping — an independent second pair of eyes, by construction.
An adversarial skeptic hunts the edge cases your tests miss.
Unattended & safe: idle-timeout supervision, USD/tick budget caps, rootless cap-dropped container, egress allow-listing.
In an instrumented diagnostic, peers built an expression-language interpreter both greenfield and brownfield to 0 defects over 50,000 random test programs — catching planted regressions and self-finding edge-case bugs the acceptance suite never probed.
German version: README_DE.md.
HOWTO: full audit + fix on an existing app: docs/HOWTO-audit-and-fix.md (German guide also available)
implement mode (build a feature from PLAN.md): docs/MODES_IMPLEMENT.md — DE
Security model: docs/SECURITY.md — DE
Quickstart (unattended, via the controller)
Path A — start from a fresh project (one shot)
peers-ctl new mything --modes=audit --spec ./mything-spec.md
$EDITOR ~/c0de/peers-c0de/mything/.peers/goals.yaml   # trim project-specific gates
peers-ctl start mything --max-ticks 20 --max-usd 5
Available modes: see peers-ctl modes list. Stack multiple with --modes=audit,thorough. Current built-in modes:
| Mode | What it does |
| --- | --- |
| audit | bug-hunt + 3-class test coverage + secrets + deps + API stability + regression + diff-size + skip/xfail justification |
| thorough | anti-convergence-theater hard gate: N=3 consecutive clean ticks + skeptic-pass + aggressive-honesty soft goals |
| describe | iterative doc-writing mode: peers write SPEC.md/ARCHITECTURE.md/DESIGN.md until N consecutive non-substantive doc commits. Use BEFORE audit on a repo that lacks docs; not composable with audit modes |
| implement | end-to-end feature implementation from a markdown PLAN.md: frozen acceptance contract, blind-review between peers, reviewer-only checkoffs, HONESTY_AUDIT + cleanliness gates (no TODO/FIXME/stubs/skipped tests at convergence). Standalone; see docs/MODES_IMPLEMENT.md |
Typical multi-mode runs:
audit + thorough (recommended default for an existing codebase):
peers-ctl new myapp --modes=audit,thorough
bare audit:
peers-ctl new myapp --modes=audit
write docs first, audit later (two separate runs):
peers-ctl new myapp --modes=describe               # run 1
peers-ctl new myapp-audit --modes=audit,thorough   # run 2
implement a feature from a PLAN.md (standalone — not composable):
peers-ctl new myfeature --container --modes=implement --plan ./PLAN.md
see docs/MODES_IMPLEMENT.md for the PLAN.md schema + escape valves.
Automatic hooks (opt-out flags):
recon pre-tick (default on): substrate scans the repo once before tick 1 and writes .peers/recon.md (detected languages, key docs, entry-point candidates, top-level tree). Free + fast — no LLM call. Eliminates the "blind tick 1" penalty. Opt out: peers-ctl start --without-recon.
codemap pre-tick (default on): substrate builds a structural CODEMAP from the AST and writes .peers/CODEMAP.yaml (machine-readable: every public symbol, its file:line and signature) plus .peers/codemap.md (a compact, byte-capped digest peers read as context). Free + fast — no LLM call. Primes peers with the codebase's public-API shape before tick 1, on top of recon's file-level view. Opt out: peers-ctl start --no-codemap.
auto-skeptic post-convergence (default on): when consecutive_clean_ticks >= N would fire convergence-reached, the orchestrator runs ONE extra tick with a critical re-audit prompt. If the skeptic-tick stays clean → really terminal. If it surfaces a new blocking bug → counter resets, loop continues. Opt out: peers-ctl start --without-post-convergence-skeptic.
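The auto-skeptic rule can be read as a small state machine. The following is a minimal sketch in Python, not the orchestrator's actual code: `LoopState`, `record_tick`, and their fields are hypothetical names chosen for illustration. Only `consecutive_clean_ticks`, the threshold N (3 in the thorough mode description), the single extra skeptic tick, and the counter reset come from the text above.

```python
from dataclasses import dataclass


@dataclass
class LoopState:
    """Hypothetical loop state; field names are illustrative only."""
    consecutive_clean_ticks: int = 0
    skeptic_pending: bool = False   # the next tick is the skeptic re-audit
    terminal: bool = False          # convergence accepted, loop ends


def record_tick(state: LoopState, clean: bool, n: int = 3,
                skeptic: bool = True) -> LoopState:
    """Advance the state by one tick result.

    clean   -- True if the tick surfaced no new blocking bug
    n       -- required number of consecutive clean ticks
    skeptic -- False models --without-post-convergence-skeptic
    """
    if state.terminal:
        return state
    if state.skeptic_pending:
        if clean:
            # Skeptic tick stayed clean: really terminal.
            return LoopState(state.consecutive_clean_ticks, False, True)
        # Skeptic surfaced a blocking bug: reset the counter, keep looping.
        return LoopState(0, False, False)
    count = state.consecutive_clean_ticks + 1 if clean else 0
    if count >= n:
        if skeptic:
            return LoopState(count, True, False)   # schedule ONE extra tick
        return LoopState(count, False, True)       # opted out: stop at once
    return LoopState(count, False, False)
```

Feeding three clean results into `record_tick` leaves the state with `skeptic_pending` set rather than `terminal`; only a fourth clean result (the skeptic tick) ends the loop in this sketch.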
peers-ctl new:
creates the directory if missing (refuses to scaffold into a non-empty dir unless --force);
bare name (no /) lands under $PEERS_PROJECTS_ROOT, default ~/c0de/peers-c0de/. Path with / is taken verbatim;
git init + initial scaffold commit;
ensures a top-level README.md exists, even when --force is used against an existing Git repo;
copies the --spec argument to SPEC.md (existing file paths are read; path-looking missing values such as ./typo.md are rejected);
runs peers init (which writes .peers/, tags peers-baseline, commits .gitignore, and creates .peers/log/runs.jsonl);
with --modes=audit, installs six audit check scripts and an audit-ready goals.yaml; use --lang=js, --lang=rust, or --lang=go for stack-specific check entrypoints;
registers the project with peers-ctl and creates the controller log under the peers-ctl config directory.
To use a different projects root (e.g. on a project-specific disk): export PEERS_PROJECTS_ROOT=/work/peers/ once, then bare names land there. peers-ctl doctor prints the active root.
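The name-versus-path rule for `peers-ctl new` can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: `resolve_project_dir` is a hypothetical helper, and only the `/` check, the `PEERS_PROJECTS_ROOT` variable, and the default root come from the description above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_ROOT = "~/c0de/peers-c0de/"  # default root stated in this README


def resolve_project_dir(name: str,
                        env: Optional[Mapping[str, str]] = None) -> Path:
    """Map a `peers-ctl new` argument to a directory (hypothetical helper)."""
    env = os.environ if env is None else env
    if "/" in name:
        # A path containing "/" is taken verbatim.
        return Path(name)
    # A bare name lands under the projects root.
    root = env.get("PEERS_PROJECTS_ROOT") or DEFAULT_ROOT
    return Path(root).expanduser() / name
```

With `PEERS_PROJECTS_ROOT=/work/peers/`, the bare name `mything` resolves to `/work/peers/mything`, while `./sub/app` is returned unchanged.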
Path B — bring your own existing project (first audit)
cd /path/to/your-target-project
peers init # writes .peers/ + commits .gitignore
$EDITOR .peers/goals.yaml # delete placeholder-replace-me, write real gates
python3 - [path] --modes=… # scaffold + register
peers-ctl add --name # register an EXISTING .peers/
peers-ctl start [] --container # start (--container = podman)
peers-ctl status [] # one or all
peers-ctl stop [] [--grace-s 10] # SIGTERM → wait → SIGKILL
peers-ctl remove # unregister (does NOT delete .peers/)
peers-ctl list # all projects + state
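The `SIGTERM → wait → SIGKILL` sequence noted for `peers-ctl stop` is a common graceful-shutdown pattern. A minimal sketch with Python's standard `subprocess` module is shown here; `stop_gracefully` is a hypothetical name, the 10-second default mirrors the `--grace-s 10` example, and the real controller may manage its processes differently (container mode, for instance, is described later as using `podman stop`).

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen, grace_s: float = 10.0) -> int:
    """Send SIGTERM, wait up to grace_s seconds, then SIGKILL if still alive.

    Returns the child's exit code.
    """
    if proc.poll() is not None:
        return proc.returncode          # already exited, nothing to do
    proc.terminate()                    # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()                     # SIGKILL on POSIX
        return proc.wait()              # reap the child to avoid a zombie
```

A process that handles SIGTERM promptly never sees the SIGKILL; the grace period only bounds how long a stuck process can delay the stop.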
Observe
peers-ctl dashboard                        # rollup across all projects
peers-ctl dashboard --live --refresh-s 1   # live rollup with alerts/events
peers-ctl dashboard --project              # recent runs + bug drilldown
peers-ctl tail []                          # follow controller log
peers-ctl logs [-n 100]                    # print last N lines
peers-ctl report []                        # write controller REPORT-.md
peers-ctl review                           # latest handoff's self-review block
Maintenance
peers-ctl doctor   # pre-flight: peers + git + peer CLIs + image
peers-ctl prune    # delete old per-project log files
Common peers operations (inside a target repo)
peers -C /path/to/target init                  # write .peers/
peers -C /path/to/target run                   # start the loop in current shell
peers -C /path/to/target run --max-ticks 5     # cap ticks
peers -C /path/to/target run --max-usd 1       # cap budget (API-key billing only)
peers -C /path/to/target status                # iteration / next peer / lock
peers -C /path/to/target info                  # config + goals snapshot
peers -C /path/to/target verify                # one-shot goal evaluation
peers -C /path/to/target report                # write .peers/REPORT.md
peers -C /path/to/target replay                # reconstruct any past tick
peers -C /path/to/target tick --after claude   # hooks-driver: trigger after a peer
peers -C /path/to/target watch                 # follow runs.jsonl
Opt-out flags (defaults are on)
peers-ctl start --without-recon
Skip the substrate-only pre-tick recon step (no LLM call, free).
Only opt out if .peers/recon.md was hand-prepared.
peers-ctl start --no-codemap
Skip the substrate-only pre-tick structural CODEMAP step (no LLM call, free).
peers-ctl start --without-post-convergence-skeptic
Skip the auto-skeptic re-audit tick that fires when consecutive_clean_ticks ≥ N would declare terminal. Default on for higher confidence; opt out for CI runs where false-convergence is acceptable.
peers-ctl start --max-ticks 50 --max-usd 1
Same flags work on both peers-ctl and peers run directly.
peers run --help and peers-ctl start --help-man show the full flag set with descriptions.
Troubleshooting
peers-ctl start fails with pasta: Failed to open() /dev/net/tun
Rootless podman's default networking needs the tun kernel module. Bypass with host networking:
PEERS_CTL_PODMAN_NETWORK=host peers-ctl start --container
To make this permanent: echo 'export PEERS_CTL_PODMAN_NETWORK=host' >> ~/.bashrc, then source ~/.bashrc. Alternatively, load the module: sudo modprobe tun (persist via /etc/modules-load.d/tun.conf).
Project shows crashed after convergence-complete
The orchestrator writes .peers/last-stop-reason.txt and reconcile maps clean reasons to stopped. If you still see crashed post-convergence:
cat .peers/last-stop-reason.txt (it should contain complete).
make build to ensure the container image matches the host code.
tick 1 process-fail or idle-timeout
process-fail after ~4 min usually means the peer CLI returned a 5xx (Anthropic Overloaded, Codex rate-limit) and the idle timeout kicked in. The run produced no commit. The next tick retries with the OTHER peer; the problematic peer auto-recovers if the rate limit was transient.
idle-timeout after exactly health.idle_timeout_s (default 900s) means the peer's stdout stayed below the silence threshold for too long. Increase idle_timeout_s in .peers/config.yaml for heavy DA mode runs, where a peer spends more time thinking before each commit.
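Idle-timeout supervision of this kind can be modelled as a watchdog that remembers when the peer last produced output. The sketch below is an assumption-laden illustration: `IdleWatchdog` is a hypothetical class, and it simplifies "below the silence threshold" to "no output at all". Only the 900-second default comes from the text above.

```python
import time


class IdleWatchdog:
    """Tracks time since the peer last produced output (hypothetical helper)."""

    def __init__(self, idle_timeout_s: float = 900.0, clock=time.monotonic):
        self.idle_timeout_s = idle_timeout_s
        self._clock = clock              # injectable so tests need not sleep
        self._last_output = clock()

    def note_output(self) -> None:
        """Call whenever the peer writes to stdout; restarts the idle window."""
        self._last_output = self._clock()

    def expired(self) -> bool:
        """True once the peer has been silent for idle_timeout_s or longer."""
        return self._clock() - self._last_output >= self.idle_timeout_s
```

Because every call to `note_output` restarts the window, a slow but talkative peer is never killed; raising `idle_timeout_s` only matters for peers that stay silent for long stretches.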
peer-unavailable: exit_event
A halt-class pattern matched (authentication failed, quota exhausted, invalid API key, usage limit per templates/config.yaml). Operator action required:
Re-login or top-up the OAuth account
Restart: peers-ctl start --container
The loop resumes from the saved iteration
This is intentional — the substrate refuses to silently degrade peers on operator-action failures.
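Halt-class detection can be pictured as a pattern scan over the peer's output. The sketch below uses only the four phrases quoted above; the actual pattern list lives in templates/config.yaml and may differ in wording, casing, or regex syntax. `classify_peer_output` is a hypothetical helper, and case-insensitive matching is an assumption.

```python
import re
from typing import Optional

# Phrases quoted in this README; the shipped list may be longer or different.
HALT_PATTERNS = [
    r"authentication failed",
    r"quota exhausted",
    r"invalid API key",
    r"usage limit",
]
_HALT_RE = re.compile("|".join(HALT_PATTERNS), re.IGNORECASE)


def classify_peer_output(text: str) -> Optional[str]:
    """Return "peer-unavailable" if a halt-class phrase appears, else None."""
    return "peer-unavailable" if _HALT_RE.search(text) else None
```

Under these assumptions, a transient error such as an overloaded response yields `None` and the loop continues with the other peer, while an invalid API key halts until the operator intervenes.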
peers-ctl list shows fresh instead of stopped
fresh means the project was registered but NEVER started. After the first successful peers-ctl start, state moves to running, then stopped/crashed on exit. If you intended to start it: peers-ctl start --container.
Container-mode (--container)
If codex (or any other peer CLI) isn't on the host but is available in the peers:dev image, run the loop inside the container:
make build              # one-time main image
make proxy-build        # egress sidecar
make auth-proxy-build   # Claude OAuth sidecar
peers-ctl doctor        # confirms podman + image exist
peers-ctl start mything --container --max-ticks 20 --max-usd 5
This spawns podman run -d --rm --name ... --userns=keep-id ... peers:dev run … and tracks the running container by name via podman ps. The displayed PID is only the host-side podman logs -f streamer. peers-ctl stop --grace-s N uses podman stop -t N, then reaps the log streamer.
Container mode bind-mounts the target repo, ~/.claude, ~/.codex, and optional read-only ~/.gitconfig. When ~/.claude.json exists, it is mounted into the per-project peers-auth-proxy_ sidecar instead of the workspace container; the workspace talks to ANTHROPIC_BASE_URL=http://127.0.0.1:8080. Before launch, peers-ctl compares the host package version with peers --version inside the image: minor/patch drift warns, major drift refuses start until you rebuild (make build).
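The pre-launch version comparison can be sketched as below. This assumes plain `MAJOR.MINOR.PATCH` version strings (optionally prefixed with `v`); the real output format of `peers --version` is not shown in this README, and `check_version_drift` is a hypothetical helper. Only the policy (minor/patch drift warns, major drift refuses) comes from the text above.

```python
from typing import Tuple


def _parse(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' (assumed format) into a 3-tuple of ints."""
    parts = version.strip().lstrip("v").split(".")
    nums = [int(p) for p in parts[:3]]
    nums += [0] * (3 - len(nums))       # treat '1.4' as '1.4.0'
    return (nums[0], nums[1], nums[2])


def check_version_drift(host: str, image: str) -> str:
    """Return 'ok', 'warn' (minor/patch drift), or 'refuse' (major drift)."""
    h, i = _parse(host), _parse(image)
    if h == i:
        return "ok"
    if h[0] != i[0]:
        return "refuse"                 # rebuild the image before starting
    return "warn"
```

In this sketch a host at 1.4.2 against an image at 1.5.0 produces `warn`, while 2.0.0 against 1.9.9 produces `refuse`.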
Override the image name with PEERS_CTL_IMAGE=name:tag if you've tagged your build differently.
Install (local development)
pip install -e .[dev]
pytest   # the full suite should pass
Single project — drive one repo
cd /path/to/your-project
peers init
$EDITOR .peers/goals.yaml   # delete the placeholder, write your gates
python3 - # reconstruct any iteration
pe
[truncated for AI cost control]