
Show HN: ANMA, boundary contracts for cheaper AI coding agents

ANMA is an open-source tool that enforces module boundaries for AI coding agents using plain-YAML contracts. It generates CLAUDE.md, hooks, and CI checks to keep agents like Claude Code within the declared architecture. In its Python benchmark, it reduced boundary violations from 68% to 0% for a cheaper model (Claude Haiku 4.5), while for frontier models it serves as insurance. It supports Python, Go, and TypeScript, and is lightweight (~800 lines) with enterprise features like drift detection and incremental adoption.

Source: Hacker News AI · Author: nxy


Boundary enforcement for AI coding agents. ANMA turns plain-YAML module contracts into the CLAUDE.md, hooks, and checks that keep Claude Code inside your architecture — and it measurably works where it matters most.

In a controlled benchmark (Python), a cheaper/faster model (Claude Haiku 4.5) violated a declared module boundary in 13 of 19 runs (68%) on a plain repo. With ANMA, it violated the boundary in 0 of 20 runs of the same task (Fisher's exact p < 0.0001). See docs/BENCHMARKS.md for the full study, including the honest part: a frontier model (Opus 4.8) respected the boundary on its own, so ANMA's value is insurance for running cheaper agents plus a CI/governance guarantee, not making a frontier model smarter.

Languages: Python, Go, and TypeScript (set language: in the root anma.yaml; one language per project). Go and TypeScript enforce module→module dependencies; interface (public:) enforcement is Python-only today. The Go/TS adapters are validated: anma check and the hook detect and block real cross-module violations. In a pre-registered follow-up (neutral prompt, harder scenario), TypeScript shows a measured effect: control 18/20 vs ANMA 0/20, Fisher's exact p < 0.00001. Go is directional and significant (10/30 → 0/30, p = 0.0004), but its control rate fell below our pre-registered 0.40 floor, so we report it as suggestive, not yet as evidence of efficacy. The Python headline is not extrapolated to either language. Details: CONCEPTS § Languages and BENCHMARKS.
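The reported p-values can be sanity-checked from the run counts alone. The sketch below is not taken from docs/BENCHMARKS.md: it assumes a one-sided Fisher's exact test (control violates at least as often as observed), which is only one of the variants the study could have used, and the function name is illustrative.

```python
from math import comb


def fisher_one_sided(viol_control: int, n_control: int,
                     viol_treated: int, n_treated: int) -> float:
    """P(control gets >= viol_control of all violations) under the null
    hypothesis that both arms share one violation rate (hypergeometric tail)."""
    total_viol = viol_control + viol_treated
    total_runs = n_control + n_treated
    denom = comb(total_runs, total_viol)
    upper = min(n_control, total_viol)
    tail = sum(
        comb(n_control, k) * comb(n_treated, total_viol - k)
        for k in range(viol_control, upper + 1)
    )
    return tail / denom


# (control violations, control runs, ANMA violations, ANMA runs) from the text above
studies = {
    "python": (13, 19, 0, 20),
    "typescript": (18, 20, 0, 20),
    "go": (10, 30, 0, 30),
}
CONTROL_FLOOR = 0.40  # floor the text states for the pre-registered follow-up

for name, (vc, nc, vt, nt) in studies.items():
    p = fisher_one_sided(vc, nc, vt, nt)
    control_rate = vc / nc
    print(f"{name}: control rate {control_rate:.2f}, p = {p:.2e}, "
          f"control >= floor: {control_rate >= CONTROL_FLOOR}")
```

Under this assumption the Go control rate comes out at 0.33, below the 0.40 floor, which matches the decision above to report Go as suggestive only.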

What it does

You declare each module's public interface and what it may depend on. anma sync compiles that into everything else, so the architecture the agent reads can never drift from the rules CI enforces:

anma.yaml                            project config (schema_version, source_roots)
src/domains/billing/
  anma.yaml                          the module contract (see docs/CONCEPTS.md for all fields)
  CLAUDE.md                          (generated) loads when Claude opens billing/
CLAUDE.md                            (generated) architecture map, between markers
.claude/rules/boundaries.md          (generated) always-loaded imperative
.claude/hooks/anma_pretooluse.py     (generated) blocks a boundary-breaking edit (exit 2)
tach.toml                            (generated) engine config (Go: .go-arch-lint.yml; TS: .dependency-cruiser.cjs)
.github/workflows/anma.yml           (generated) CI: drift check + boundary check
DECISIONS.md                         append-only: why each boundary exists
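To illustrate what "between markers" can mean for the root CLAUDE.md, here is a minimal sketch of replacing a generated region while leaving hand-written text alone, plus a drift test built on it. The marker strings and function names are hypothetical; ANMA's actual markers and logic may differ.

```python
BEGIN = "<!-- anma:begin -->"  # hypothetical marker, not ANMA's actual string
END = "<!-- anma:end -->"      # hypothetical marker


def replace_generated(doc: str, generated: str) -> str:
    """Return doc with the text between the markers replaced by `generated`.
    Hand-written text outside the markers is left untouched."""
    block = f"{BEGIN}\n{generated.strip()}\n{END}"
    start = doc.find(BEGIN)
    end = doc.find(END)
    if start == -1 or end == -1 or end < start:
        # No marked region yet: append one at the end of the file.
        return doc.rstrip("\n") + "\n\n" + block + "\n"
    return doc[:start] + block + doc[end + len(END):]


def has_drifted(doc: str, generated: str) -> bool:
    """True when re-generating would change the file, i.e. the docs are stale."""
    return replace_generated(doc, generated) != doc


doc = "# My project\n\nHand-written intro.\n\n" + BEGIN + "\nold map\n" + END + "\n"
fresh = replace_generated(doc, "billing -> accounts")
print(has_drifted(doc, "billing -> accounts"))    # stale file
print(has_drifted(fresh, "billing -> accounts"))  # freshly synced file
```

Because the replacement is idempotent, the same function can serve both a "write" mode and a "check only" mode, which is the general shape of a sync/drift-check pair.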

Quickstart (60 seconds)

pip install anma[tach]   # tach backend recommended; works without it too
anma init                # scaffolds contracts + a worked accounts/billing example
anma sync                # generates CLAUDE.md, nested docs, hooks, tach.toml, CI
anma check               # ✓ boundaries respected

For Go or TypeScript, scaffold with anma init --language go / anma init --language typescript (the external backends — go-arch-lint, dependency-cruiser — are optional; a builtin scanner is the zero-dep fallback).

Full walkthrough: docs/QUICKSTART.md.

Commands

anma init           # scaffold contracts + a worked example
anma sync           # regenerate all artifacts from contracts
anma sync --check   # CI guard: fail if generated artifacts drifted from contracts
anma check          # enforce boundaries (hook / pre-commit / CI)
anma check --warn   # report violations but exit 0 (incremental adoption)
anma check --json   # machine-readable output for pipelines

Exit codes: 0 ok · 1 violations, contract errors, or drift.
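As a sketch of how these commands and exit codes might be chained in a pipeline script, the helper below runs the drift check first and the boundary check second, stopping at the first non-zero exit. The commands and flags are the ones listed above; the ordering, the helper names, and the injectable runner are assumptions for illustration, not part of ANMA.

```python
import subprocess
from typing import Callable, Sequence


def ci_steps(warn_only: bool = False) -> list[list[str]]:
    """Commands for one CI pass; warn_only swaps in `anma check --warn`."""
    check = ["anma", "check"] + (["--warn"] if warn_only else [])
    return [["anma", "sync", "--check"], check]


def real_runner(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code (requires anma on PATH)."""
    return subprocess.run(list(cmd)).returncode


def run_gate(steps: Sequence[Sequence[str]],
             run: Callable[[Sequence[str]], int] = real_runner) -> int:
    """Run each step in order; stop at and return the first non-zero exit code."""
    for cmd in steps:
        code = run(cmd)
        if code != 0:
            print("failed:", " ".join(cmd))
            return code
    return 0


# Dry run with a stand-in runner that pretends the drift check failed.
def fake_runner(cmd: Sequence[str]) -> int:
    return 1 if "sync" in cmd else 0


print(run_gate(ci_steps(), run=fake_runner))
```

In a real pipeline the script would end with something like `sys.exit(run_gate(ci_steps()))`, so that exit code 1 from either command fails the build.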

Two layers: guidance and enforcement

ANMA works at two levels, and the benchmark shows they play different roles:

Guidance — the generated root and per-module CLAUDE.md and .claude/rules put your architecture in the agent's context. This is what drove the 68% → 0 result: the model was steered to the correct design and didn't attempt a bad edit.

Enforcement — the PreToolUse hook judges the proposed edit and returns exit 2 to block any new disallowed import before it lands; the same check runs at pre-commit and in CI. This is the guarantee that holds for the edits guidance doesn't catch, and regardless of which model or human wrote the diff.

The enforcement hook is verified to fire (feed it a forbidden edit → exit 2); in the benchmark it never needed to, because guidance pre-empted every bad edit. Both matter; see the benchmarks for exactly what each one is shown to do.
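The enforcement idea can be sketched in a few lines: parse the proposed file contents, collect its imports, and return exit code 2 when one of them crosses an undeclared boundary. This is an illustration of the technique, not the generated anma_pretooluse.py; the contract table, the module names, and the simplification that each module is a top-level package are all assumptions.

```python
import ast

# Hypothetical contract table: module name -> modules it may import from.
ALLOWED_DEPS = {
    "billing": {"accounts"},
    "accounts": set(),
}
MODULES = set(ALLOWED_DEPS)


def imported_modules(source: str) -> set[str]:
    """Top-level package names imported (absolutely) by Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def check_edit(module: str, new_source: str) -> int:
    """Return 2 (block) if the proposed source imports a sibling module that
    the contract does not allow, else 0."""
    siblings = MODULES - {module}
    forbidden = (imported_modules(new_source) & siblings) - ALLOWED_DEPS[module]
    for name in sorted(forbidden):
        print(f"blocked: {module} may not import {name}")
    return 2 if forbidden else 0


print(check_edit("billing", "from accounts.api import get_user\n"))  # declared dependency
print(check_edit("accounts", "import billing.invoices\n"))           # undeclared dependency
```

Running the same function from a hook, from pre-commit, and from CI is what makes the result independent of who or what wrote the diff.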

Who it's for

Teams running cheaper or faster agents (cost-sensitive pipelines, bulk tasks, non-frontier or non-Claude models) that don't reliably respect an architecture on their own — this is where ANMA's steering is decisive.

Anyone who wants an enforced architecture: a guarantee in CI/pre-commit that module boundaries hold no matter who or what wrote the change.

Teams that want architecture as governance: declared interfaces, ownership → CODEOWNERS, and docs that can't silently drift from the rules.

If you only ever drive a frontier model on small, well-described tasks, ANMA may add turns without changing outcomes — and the benchmarks say so plainly.

Lightweight by design

~800 lines, no runtime, no DSL, one small dependency (PyYAML) — the builtin engine needs nothing more, and the faster external backends (tach for Python, go-arch-lint for Go, dependency-cruiser for TypeScript) are all optional. A security team can read the whole tool in an afternoon.

Enterprise

Drift detection — anma sync --check fails CI if generated docs/config fall out of sync with the contracts.

Incremental adoption — anma check --warn and per-module deprecated_deps let a large codebase adopt without a red build on day one.

Governance — owners: per module generates CODEOWNERS; source_roots: supports monorepos.

Supply chain — signed releases (PyPI Trusted Publishing + provenance + SBOM), pip-audit in CI, Apache-2.0. See SECURITY.md.
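As an illustration of the governance item, the sketch below renders a CODEOWNERS file from a per-module owners mapping. The input shape, paths, and handles are hypothetical; how ANMA actually reads owners: and lays out the file is described in its own docs, not here.

```python
def render_codeowners(owners_by_path: dict[str, list[str]]) -> str:
    """One CODEOWNERS line per module path, sorted so the output is stable."""
    lines = [
        f"{path} {' '.join(owners)}"
        for path, owners in sorted(owners_by_path.items())
        if owners  # skip modules with no declared owner
    ]
    return "\n".join(lines) + "\n"


# Hypothetical module paths and team handles.
contracts = {
    "/src/domains/billing/": ["@acme/billing"],
    "/src/domains/accounts/": ["@acme/identity", "@alice"],
}
print(render_codeowners(contracts), end="")
```

Sorting the entries keeps the generated file deterministic, which matters if the same output is later compared byte-for-byte in a drift check.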

Documentation

docs/QUICKSTART.md — install to first blocked edit

docs/CONCEPTS.md — the model, the contract schema reference, generated artifacts, the engine

docs/BENCHMARKS.md — the with/without study, methodology, and honest limits

CONTRIBUTING.md — dev setup, tests, the dogfood, the schema-stability rule

SECURITY.md · RELEASE.md · CHANGELOG.md

Apache-2.0 · ANMA Labs LLC


anmalabs.dev


Latest release: v0.7.0, Go & TypeScript (Jun 7, 2026) · 4 releases in total


Repository languages: Python 95.2% · Go 2.5% · TypeScript 1.8% · JavaScript 0.5%