
New version of "peers" – the AI couple doing things

peers is an open-source tool that drives two or more AI coding agents (Claude Code, Codex, etc.) as cooperating peers with hard gates: tests pass, coverage holds, no regression, no TODOs/stubs/skipped tests, secrets clean. One peer implements, the other blind-reviews, and an adversarial skeptic re-audits before acceptance. Runs unattended, budget-capped, and container-sandboxed.

Source: Hacker News AI. Author: dash0r


Two AI coding agents are better than one — if you make them prove it.

peers drives n ≥ 2 AI coding CLIs (Claude Code, Codex, …) as cooperating peers that don't just agree a task is done — they have to clear hard, measurable gates first: tests pass, coverage holds, no regression, no TODO/stub/skipped-test, secrets clean. One peer implements, the other blind-reviews (without seeing the first's notes), and an adversarial skeptic re-audits before any "done" is accepted. Runs unattended, budget-capped, and container-sandboxed.

Why it beats a single agent on a loop:

Gated, not vibes-based. "Looks done" never converges — gates green + skeptic-clean does. No convergence theater.

Blind peer review catches rubber-stamping — an independent second pair of eyes, by construction.

An adversarial skeptic hunts the edge cases your tests miss.

Unattended & safe: idle-timeout supervision, USD/tick budget caps, rootless cap-dropped container, egress allow-listing.

In an instrumented diagnostic, peers built an expression-language interpreter both greenfield and brownfield to 0 defects over 50,000 random test programs — catching planted regressions and self-finding edge-case bugs the acceptance suite never probed.

German version: README_DE.md.

HOWTO: full audit + fix on an existing app: docs/HOWTO-audit-and-fix.md (a German guide is also available)

implement mode (build a feature from PLAN.md): docs/MODES_IMPLEMENT.md — DE

Security model: docs/SECURITY.md — DE

Quickstart (unattended, via the controller)

Path A — start from a fresh project (one shot)

peers-ctl new mything --modes=audit --spec ./mything-spec.md
$EDITOR ~/c0de/peers-c0de/mything/.peers/goals.yaml # trim project-specific gates
peers-ctl start mything --max-ticks 20 --max-usd 5

Available modes: see peers-ctl modes list. Stack multiple with --modes=audit,thorough. Current built-in modes:

| Mode | What it does |
| --- | --- |
| audit | bug-hunt + 3-class test coverage + secrets + deps + API stability + regression + diff-size + skip/xfail justification |
| thorough | anti-convergence-theater hard gate: N=3 consecutive clean ticks + skeptic-pass + aggressive-honesty soft goals |
| describe | iterative doc-writing mode: peers write SPEC.md/ARCHITECTURE.md/DESIGN.md until N consecutive non-substantive doc commits. Use BEFORE audit on a repo that lacks docs; not composable with audit modes |
| implement | end-to-end feature implementation from a markdown PLAN.md: frozen acceptance contract, blind-review between peers, reviewer-only checkoffs, HONESTY_AUDIT + cleanliness gates (no TODO/FIXME/stubs/skipped tests at convergence). Standalone; see docs/MODES_IMPLEMENT.md |

Typical multi-mode runs:

audit + thorough (recommended default for an existing codebase):

peers-ctl new myapp --modes=audit,thorough

bare audit:

peers-ctl new myapp --modes=audit

write docs first, audit later (two separate runs):

peers-ctl new myapp --modes=describe # run 1
peers-ctl new myapp-audit --modes=audit,thorough # run 2

implement a feature from a PLAN.md (standalone — not composable):

peers-ctl new myfeature --container --modes=implement --plan ./PLAN.md

see docs/MODES_IMPLEMENT.md for the PLAN.md schema + escape valves.
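The stacking rules above (audit and thorough combine; describe and implement run on their own) can be pictured as a small validator. This is a minimal sketch, not the real peers-ctl parser: `parse_modes`, `STACKABLE` and `STANDALONE` are hypothetical names, and treating describe as fully standalone is an assumption drawn from the "not composable with audit modes" note.

```python
# Hypothetical sketch: validate a --modes value against the composition rules
# stated in the mode list. The real peers-ctl parser may behave differently.
STACKABLE = {"audit", "thorough"}       # documented as combinable
STANDALONE = {"describe", "implement"}  # assumed to never stack


def parse_modes(value: str) -> list[str]:
    """Split a comma-separated --modes value and reject invalid combinations."""
    modes = [m.strip() for m in value.split(",") if m.strip()]
    if not modes:
        raise ValueError("no modes given")
    unknown = [m for m in modes if m not in STACKABLE | STANDALONE]
    if unknown:
        raise ValueError(f"unknown mode(s): {', '.join(unknown)}")
    standalone = [m for m in modes if m in STANDALONE]
    if standalone and len(modes) > 1:
        raise ValueError(f"{standalone[0]} is standalone and cannot be stacked")
    return modes
```

Under these assumptions, `parse_modes("audit,thorough")` is accepted while `parse_modes("describe,audit")` raises `ValueError`.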

Automatic hooks (opt-out flags):

recon pre-tick (default on): substrate scans the repo once before tick 1 and writes .peers/recon.md (detected languages, key docs, entry-point candidates, top-level tree). Free + fast — no LLM call. Eliminates the "blind tick 1" penalty. Opt out: peers-ctl start --without-recon.

codemap pre-tick (default on): substrate builds a structural CODEMAP from the AST and writes .peers/CODEMAP.yaml (machine-readable: every public symbol, its file:line and signature) plus .peers/codemap.md (a compact, byte-capped digest peers read as context). Free + fast — no LLM call. Primes peers with the codebase's public-API shape before tick 1, on top of recon's file-level view. Opt out: peers-ctl start --no-codemap.

auto-skeptic post-convergence (default on): when consecutive_clean_ticks >= N would fire convergence-reached, the orchestrator runs ONE extra tick with a critical re-audit prompt. If the skeptic-tick stays clean → really terminal. If it surfaces a new blocking bug → counter resets, loop continues. Opt out: peers-ctl start --without-post-convergence-skeptic.
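The auto-skeptic rule described above amounts to a small state machine: count consecutive clean ticks, and when the count reaches N, run one re-audit tick before accepting convergence. The sketch below only illustrates that logic. `ConvergenceTracker` and its return strings are hypothetical, and the default N=3 is borrowed from the thorough mode description rather than confirmed as the orchestrator's default.

```python
from dataclasses import dataclass


@dataclass
class ConvergenceTracker:
    """Counts consecutive clean ticks and gates 'done' behind a skeptic tick."""
    required_clean: int = 3        # N; assumed from the 'thorough' mode text
    skeptic_enabled: bool = True   # False models the opt-out flag
    consecutive_clean_ticks: int = 0

    def record_tick(self, clean: bool) -> str:
        """Record a normal tick; return 'continue', 'run-skeptic' or 'terminal'."""
        self.consecutive_clean_ticks = self.consecutive_clean_ticks + 1 if clean else 0
        if self.consecutive_clean_ticks < self.required_clean:
            return "continue"
        return "run-skeptic" if self.skeptic_enabled else "terminal"

    def record_skeptic(self, found_blocking_bug: bool) -> str:
        """Record the one extra re-audit tick that runs before convergence."""
        if found_blocking_bug:
            self.consecutive_clean_ticks = 0  # counter resets, loop continues
            return "continue"
        return "terminal"
```

With the skeptic enabled, three clean ticks yield `"run-skeptic"` rather than `"terminal"`; only a clean skeptic tick ends the loop.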

peers-ctl new:

creates the directory if missing (refuses to scaffold into a non-empty dir unless --force);

bare name (no /) lands under $PEERS_PROJECTS_ROOT, default ~/c0de/peers-c0de/. Path with / is taken verbatim;

git init + initial scaffold commit;

ensures a top-level README.md exists, even when --force is used against an existing Git repo;

copies the --spec argument to SPEC.md (existing file paths are read; path-looking missing values such as ./typo.md are rejected);

runs peers init (which writes .peers/, tags peers-baseline, commits .gitignore, and creates .peers/log/runs.jsonl);

with --modes=audit, installs six audit check scripts and an audit-ready goals.yaml; use --lang=js, --lang=rust, or --lang=go for stack-specific check entrypoints;

registers the project with peers-ctl and creates the controller log under the peers-ctl config directory.

To use a different projects root (e.g. on a project-specific disk): export PEERS_PROJECTS_ROOT=/work/peers/ once, then bare names land there. peers-ctl doctor prints the active root.
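The name-resolution rule (bare names land under the projects root, anything containing a slash is taken verbatim) can be sketched as follows. `resolve_project_path` is a hypothetical helper written for illustration; only the `PEERS_PROJECTS_ROOT` variable and the default root come from the text above.

```python
import os
from pathlib import Path

DEFAULT_ROOT = "~/c0de/peers-c0de/"  # default root stated in the README


def resolve_project_path(name: str, env=None) -> Path:
    """Hypothetical helper: map a `peers-ctl new` argument to a directory."""
    env = os.environ if env is None else env
    if "/" in name:
        # A path containing "/" is taken verbatim.
        return Path(name)
    # A bare name lands under the configured (or default) projects root.
    root = env.get("PEERS_PROJECTS_ROOT", DEFAULT_ROOT)
    return Path(root).expanduser() / name
```

Passing a dictionary as `env` keeps the sketch easy to test without touching the real environment.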

Path B — bring your own existing project (first audit)

cd /path/to/your-target-project
peers init # writes .peers/ + commits .gitignore
$EDITOR .peers/goals.yaml # delete placeholder-replace-me, write real gates
python3 - [path] --modes=… # scaffold + register
peers-ctl add --name # register an EXISTING .peers/
peers-ctl start [] --container # start (--container = podman)
peers-ctl status [] # one or all
peers-ctl stop [] [--grace-s 10] # SIGTERM → wait → SIGKILL
peers-ctl remove # unregister (does NOT delete .peers/)
peers-ctl list # all projects + state

Observe

peers-ctl dashboard # rollup across all projects
peers-ctl dashboard --live --refresh-s 1 # live rollup with alerts/events
peers-ctl dashboard --project # recent runs + bug drilldown
peers-ctl tail [] # follow controller log
peers-ctl logs [-n 100] # print last N lines
peers-ctl report [] # write controller REPORT-.md
peers-ctl review # latest handoff's self-review block

Maintenance

peers-ctl doctor # pre-flight: peers + git + peer CLIs + image
peers-ctl prune # delete old per-project log files

Common peers operations (inside a target repo)

peers -C /path/to/target init # write .peers/
peers -C /path/to/target run # start the loop in current shell
peers -C /path/to/target run --max-ticks 5 # cap ticks
peers -C /path/to/target run --max-usd 1 # cap budget (API-key billing only)
peers -C /path/to/target status # iteration / next peer / lock
peers -C /path/to/target info # config + goals snapshot
peers -C /path/to/target verify # one-shot goal evaluation
peers -C /path/to/target report # write .peers/REPORT.md
peers -C /path/to/target replay # reconstruct any past tick
peers -C /path/to/target tick --after claude # hooks-driver: trigger after a peer
peers -C /path/to/target watch # follow runs.jsonl

Opt-out flags (defaults are on)

peers-ctl start --without-recon

Skip the substrate-only pre-tick recon step (no LLM call, free).

Only opt out if .peers/recon.md was hand-prepared.

peers-ctl start --no-codemap

Skip the substrate-only pre-tick structural CODEMAP step (no LLM call, free).

peers-ctl start --without-post-convergence-skeptic

Skip the auto-skeptic re-audit tick that fires when consecutive_clean_ticks ≥ N would declare terminal. Default on for higher confidence; opt out for CI runs where false-convergence is acceptable.

peers-ctl start --max-ticks 50 --max-usd 1

Same flags work on both peers-ctl and peers run directly.

peers run --help and peers-ctl start --help-man show the full flag set with descriptions.
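To make the effect of the --max-ticks and --max-usd caps concrete, here is a minimal sketch of a loop that stops at whichever cap is reached first. `run_capped`, the per-tick cost callback and the stop-reason strings are all hypothetical; how peers actually accounts for cost is not described here.

```python
from typing import Callable


def run_capped(tick: Callable[[int], float], max_ticks: int, max_usd: float) -> tuple[int, float, str]:
    """Run ticks until a cap is hit. `tick(i)` returns that tick's cost in USD.

    Hypothetical sketch: returns (ticks_run, usd_spent, stop_reason).
    """
    spent = 0.0
    for i in range(1, max_ticks + 1):
        if spent >= max_usd:
            # Budget is checked before a tick starts, so the final tick
            # may overshoot the cap slightly.
            return i - 1, spent, "max-usd"
        spent += tick(i)
    return max_ticks, spent, "max-ticks"
```

Checking the budget before each tick rather than mid-tick is a design choice of this sketch: it keeps every tick atomic at the price of a possible small overshoot.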

Troubleshooting

peers-ctl start fails with pasta: Failed to open() /dev/net/tun

Rootless podman's default networking needs the tun kernel module. Bypass with host networking:

PEERS_CTL_PODMAN_NETWORK=host peers-ctl start --container

To make this permanent: echo 'export PEERS_CTL_PODMAN_NETWORK=host' >> ~/.bashrc, then source ~/.bashrc. Alternatively, load the module: sudo modprobe tun (persist it via /etc/modules-load.d/tun.conf).

Project shows crashed after convergence-complete

The orchestrator writes .peers/last-stop-reason.txt and reconcile maps clean reasons to stopped. If you still see crashed post-convergence:

cat .peers/last-stop-reason.txt — should contain complete .

make build to ensure the container image matches the host code.

tick 1 process-fail or idle-timeout

process-fail after about 4 minutes usually means the peer CLI returned a 5xx (Anthropic Overloaded, Codex rate-limit) and the idle-timeout kicked in. The run produced no commit. The next tick retries the OTHER peer; the problematic peer auto-recovers if the rate limit was transient.

idle-timeout after exactly health.idle_timeout_s (default 900s) means the peer's stdout stayed below the silence threshold for too long. Increase idle_timeout_s in .peers/config.yaml for heavy DA mode runs (the peer spends more time thinking before each commit).
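The idle-timeout behaviour can be pictured as a watchdog that remembers when the peer last produced output. This is a simplified sketch: `IdleWatchdog` is a hypothetical class, and it treats any output as activity, whereas the text above suggests the real check uses a silence threshold on stdout volume.

```python
import time


class IdleWatchdog:
    """Flags a peer as idle when it has produced no output for too long."""

    def __init__(self, idle_timeout_s: float = 900.0, clock=time.monotonic):
        self.idle_timeout_s = idle_timeout_s  # 900s mirrors the stated default
        self._clock = clock                   # injectable for testing
        self._last_output = clock()

    def on_output(self) -> None:
        """Call whenever the peer writes to stdout."""
        self._last_output = self._clock()

    def timed_out(self) -> bool:
        """True once the silence has lasted at least idle_timeout_s seconds."""
        return self._clock() - self._last_output >= self.idle_timeout_s
```

Raising `idle_timeout_s` simply widens the window in which a quiet but still-working peer is left alone.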

peer-unavailable: exit_event

A halt-class pattern matched (authentication failed, quota exhausted, invalid API key, usage limit per templates/config.yaml). Operator action required:

Log in again or top up the OAuth account

Restart: peers-ctl start --container

The loop resumes from the saved iteration

This is intentional — the substrate refuses to silently degrade peers on operator-action failures.
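Halt-class matching of this kind is often done with a case-insensitive pattern scan over the peer's output. The sketch below assumes that approach; `classify_peer_output` is a hypothetical function, and the phrases are copied from the list above rather than from templates/config.yaml, whose exact patterns may differ.

```python
import re

# Phrases taken from the README text; the real list lives in
# templates/config.yaml and may differ in wording and matching rules.
HALT_PATTERNS = [
    "authentication failed",
    "quota exhausted",
    "invalid API key",
    "usage limit",
]
_HALT_RE = re.compile("|".join(re.escape(p) for p in HALT_PATTERNS), re.IGNORECASE)


def classify_peer_output(text: str) -> str:
    """Return 'peer-unavailable' when a halt-class phrase appears, else 'ok'."""
    return "peer-unavailable" if _HALT_RE.search(text) else "ok"
```

A transient rate-limit message does not match any of these phrases, so in this sketch it falls through to `"ok"` and is left to the normal retry path.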

peers-ctl list shows fresh instead of stopped

fresh means the project was registered but NEVER started. After the first successful peers-ctl start, state moves to running, then stopped/crashed on exit. If you intended to start it: peers-ctl start --container.

Container-mode (--container)

If codex (or any other peer CLI) isn't on the host but is available in the peers:dev image, run the loop inside the container:

make build # one-time main image
make proxy-build # egress sidecar
make auth-proxy-build # Claude OAuth sidecar
peers-ctl doctor # confirms podman + image exist
peers-ctl start mything --container --max-ticks 20 --max-usd 5

This spawns podman run -d --rm --name ... --userns=keep-id ... peers:dev run … and tracks the running container by name via podman ps. The displayed PID is only the host-side podman logs -f streamer. peers-ctl stop --grace-s N uses podman stop -t N, then reaps the log streamer.

Container mode bind-mounts the target repo, ~/.claude, ~/.codex, and optional read-only ~/.gitconfig. When ~/.claude.json exists, it is mounted into the per-project peers-auth-proxy_ sidecar instead of the workspace container; the workspace talks to ANTHROPIC_BASE_URL=http://127.0.0.1:8080. Before launch, peers-ctl compares the host package version with peers --version inside the image: minor/patch drift warns, major drift refuses start until you rebuild (make build).
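The version-drift rule (minor or patch drift warns, major drift refuses to start) can be sketched as a simple comparison. `check_version_drift` is a hypothetical function, and the sketch assumes plain numeric MAJOR.MINOR.PATCH strings; pre-release or local version suffixes would need a real version parser.

```python
def check_version_drift(host: str, image: str) -> str:
    """Compare two 'MAJOR.MINOR.PATCH' strings: 'ok', 'warn' or 'refuse'."""
    host_parts = tuple(int(p) for p in host.split("."))
    image_parts = tuple(int(p) for p in image.split("."))
    if host_parts == image_parts:
        return "ok"
    if host_parts[0] != image_parts[0]:
        return "refuse"  # major drift: rebuild the image (make build)
    return "warn"        # minor/patch drift: start, but warn the operator
```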

Override the image name with PEERS_CTL_IMAGE=name:tag if you've tagged your build differently.

Install (local development)

pip install -e .[dev]
pytest # the full suite should pass

Single project — drive one repo

cd /path/to/your-project
peers init
$EDITOR .peers/goals.yaml # delete the placeholder, write your gates
python3 - # reconstruct any iteration
