Riskratchet: Stop AI-generated code from rotting your codebase
Riskratchet is a Python tool that computes per-function risk scores based on coverage gaps, cyclomatic complexity, churn, public surface, and sprawl, then fails CI or blocks commits whenever risk grows past a baseline. It is designed to mechanically maintain code quality in AI-assisted development, catching regressions that human review might miss.
A maintainability ratchet for AI-assisted Python. The bar can only move down.
PyPI · Source · Post
AI coding agents are very good at writing code that compiles, runs, and passes the tests they ship with it. They are less good at:
writing meaningful tests for the new code,
noticing a 30-line function quietly became 130 lines,
catching that the public API now exposes a function with no callers in tests,
realising a small refactor turned an if ladder into a 14-way cyclomatic monster.
A traditional review catches some of this. A ratchet catches all of it, mechanically, every time. riskratchet computes a per-function risk score from coverage gaps, cyclomatic complexity, churn, public surface, and sprawl, then fails CI or blocks the commit whenever risk grows past a baseline. Nobody has to play complexity cop.
The review workflow is inspired by cargo-crap (which made the CRAP metric practical in CI with baselines, PR comments, and JSON output) and Cursor's thermo-nuclear-code-quality-review agent prompt (which emphasises maintainability, structure, sprawl, and explicit boundaries). riskratchet is neither a Python port of cargo-crap nor an agent prompt: it reports CRAP and adds Python-specific signals on top (branch gaps, churn, public surface, sprawl).
Quickstart
pip install riskratchet
# or run without installing
uvx riskratchet --help

# 1. run your tests with coverage in JSON form
pytest --cov --cov-report=json:coverage.json

# 2. snapshot the current risk profile
riskratchet baseline src --coverage coverage.json --output .riskratchet.json

# 3. inspect what was captured
riskratchet scan src --coverage coverage.json

# 4. fail the build when risk regresses
riskratchet check src --coverage coverage.json --baseline .riskratchet.json
riskratchet check exits 1 on regressions, 2 on usage errors (e.g. missing baseline), and 0 otherwise.
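A wrapper script can branch on these exit codes. The following is a minimal Python sketch, assuming `riskratchet` is on `PATH`; the names `describe_exit` and `run_check` are illustrative helpers, not part of riskratchet.

```python
import subprocess

# Meanings taken from the exit codes documented above; anything else is unknown.
EXIT_MEANINGS = {
    0: "ok: no regressions",
    1: "regression: risk grew past the baseline",
    2: "usage error: e.g. missing baseline",
}


def describe_exit(code: int) -> str:
    """Map a `riskratchet check` exit code to a short label."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_check(paths=("src",), coverage="coverage.json",
              baseline=".riskratchet.json") -> int:
    """Run `riskratchet check` with the Quickstart flags and return its exit code."""
    cmd = ["riskratchet", "check", *paths,
           "--coverage", coverage, "--baseline", baseline]
    result = subprocess.run(cmd, check=False)  # do not raise on exit code 1 or 2
    print(describe_exit(result.returncode))
    return result.returncode
```

Calling `sys.exit(run_check())` from a script passes the original status through to CI unchanged.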
For early adoption before a baseline exists, check --fail-above N gates on an absolute threshold without requiring a baseline (baseline gating remains the recommended mode for mature codebases):
# No baseline yet: fail if any function scores above 60.
riskratchet check src --coverage coverage.json --fail-above 60

# scan also exposes a no-baseline gate (different exit/output shape).
riskratchet scan src --coverage coverage.json --fail-above 75
riskratchet scan src --coverage coverage.json --fail-severity high
When --baseline and --fail-above are both given, the baseline gate is authoritative and --fail-above is ignored with a stderr warning.
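The precedence rule can be modelled in a few lines. This is an illustrative Python sketch of the behaviour described above, not riskratchet's actual implementation; `select_gate` and the `"none"` fallback for the case where neither flag is given are assumptions.

```python
from typing import Optional, Tuple


def select_gate(baseline: Optional[str],
                fail_above: Optional[float]) -> Tuple[str, Optional[str]]:
    """Return (gate, warning) for a given flag combination.

    The baseline gate wins whenever a baseline is supplied; a simultaneous
    --fail-above is ignored and reported through the warning string.
    """
    if baseline is not None:
        warning = None
        if fail_above is not None:
            warning = "--fail-above ignored because --baseline was given"
        return "baseline", warning
    if fail_above is not None:
        return "fail-above", None
    # Hypothetical: the source does not say what happens with neither flag.
    return "none", None
```

A caller would print the warning to stderr and then run only the selected gate.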
Setting up riskratchet
riskratchet init scaffolds a [tool.riskratchet] section in pyproject.toml and prints a ready-to-paste CI snippet. With --with-baseline (or by saying yes to the interactive prompt on a TTY when pytest is detected), it also runs pytest --cov and creates the baseline in one go:
riskratchet init                  # write config, print snippet
riskratchet init --with-baseline  # also run pytest --cov + baseline
riskratchet init --force          # replace existing [tool.riskratchet]
riskratchet doctor is a six-check pre-flight that names whatever would make check fail to start (missing paths, missing/malformed baseline, missing/stale coverage, no git history, unknown config keys, invalid suppressions) and prints the exact fix command for each. The status table goes to stdout; the → fix: remediations go to stderr so you can pipe them separately:
riskratchet doctor              # human-readable table + remediation
riskratchet doctor --json       # validates against schemas/doctor.schema.json
riskratchet doctor 2>/dev/null  # status table only
riskratchet doctor >/dev/null   # remediation commands only
doctor exits 0 only when every check is pass or warn; a single fail exits 1. The intended workflow is init → doctor → fix the warnings → baseline → check.
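The stdout/stderr split and the exit rule are straightforward to consume from a script. This is a minimal Python sketch; `run_split` and `doctor_exit_code` are hypothetical helper names, and the status strings `pass`, `warn`, and `fail` are the ones named above.

```python
import subprocess
from typing import Iterable, List, Tuple


def doctor_exit_code(statuses: Iterable[str]) -> int:
    """Mirror the documented rule: 0 if every check is pass or warn, 1 on any fail."""
    return 1 if any(status == "fail" for status in statuses) else 0


def run_split(cmd: List[str]) -> Tuple[str, str, int]:
    """Run a command, capturing stdout and stderr as separate strings."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return proc.stdout, proc.stderr, proc.returncode
```

With riskratchet installed, `table, fixes, code = run_split(["riskratchet", "doctor"])` would put the status table in `table` and the remediation lines in `fixes`.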
GitHub Action
The composite action ships in action.yml so adopters don't have to copy a workflow file — uses: KayhanB21/[email protected] is the canonical reference. The action installs riskratchet via uv tool install, runs check (--format pr-comment in both baseline and no-baseline modes), upserts a sticky PR comment, and surfaces the check exit status so PR checks reflect regressions.
# .github/workflows/riskratchet.yml
on: [pull_request]
jobs:
  riskratchet:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
      - uses: KayhanB21/[email protected]
        with:
          coverage: coverage.json
Inputs (defaults in parentheses): paths ([tool.riskratchet] paths), coverage (auto-detected), baseline (.riskratchet.json — when the file is missing, the action runs in --fail-above mode), fail-above (60), comment (true), python-version (3.12), riskratchet-version (latest from PyPI), github-token (${{ github.token }}).
For Marketplace discovery, the KayhanB21/riskratchet-action wrapper repo is the recommended entry point; it delegates to the root action.yml so both shapes share one source of truth.
Verifying releases
Every tagged release ships supply-chain provenance you can inspect: a CycloneDX SBOM of the wheel's runtime dependency closure (the sbom workflow artifact), a signed GitHub build-provenance attestation on the wheel and sdist, and PEP 740 PyPI attestations from Trusted Publishing. To confirm a downloaded wheel was built by this repo:
gh attestation verify riskratchet-<version>-py3-none-any.whl --owner KayhanB21
See docs/threat-model.md for what each artifact does and does not vouch for.
The canonical use case: AI agent + side project
You've been vibe-coding a FastAPI backend with an AI agent for eight months. It works, tests are green-ish (62% coverage), but you just noticed services/billing.py::reconcile_subscriptions quietly grew to 180 lines and an 11-way match statement you don't remember writing.
pip install riskratchet
pytest --cov --cov-branch --cov-report=json:coverage.json
riskratchet scan src --coverage coverage.json --top 10
reconcile_subscriptions shows up at score 71 (high) with structural_complexity: 90, sprawl: 55, coverage_gap: 60. You also spot a surprise: a 12-line public utility _normalize_plan_id scoring 48 because it has zero tests. Snapshot the bar:
riskratchet baseline src --coverage coverage.json --output .riskratchet.json
git add .riskratchet.json && git commit -m "Add riskratchet baseline"
From here, every time the agent adds a webhook handler or "refactors the billing flow," run riskratchet check before committing. If it quietly bloated reconcile_subscriptions from 180 to 220 lines, the check exits 1 and names the regression. You stop having to remember to look.
Why this is the canonical use case: AI agents are excellent at adding code, mediocre at noticing they've made things worse. The baseline is your memory.
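The ratchet idea itself fits in a few lines. The sketch below is a simplified Python illustration, not riskratchet's algorithm or baseline format: it assumes scores are plain `{function_id: score}` mappings, and it skips functions missing from the baseline, which a real tool would need a policy for.

```python
from typing import Dict, List, Tuple


def find_regressions(baseline: Dict[str, float],
                     current: Dict[str, float],
                     tolerance: float = 0.0) -> List[Tuple[str, float, float]]:
    """Return (function_id, old_score, new_score) for every score that grew."""
    regressions = []
    for function_id, new_score in sorted(current.items()):
        old_score = baseline.get(function_id)
        if old_score is None:
            continue  # new function: out of scope for this sketch
        if new_score > old_score + tolerance:
            regressions.append((function_id, old_score, new_score))
    return regressions


# Illustrative numbers only: 71 comes from the scenario above, 78 is made up.
baseline = {"services/billing.py::reconcile_subscriptions": 71.0}
current = {"services/billing.py::reconcile_subscriptions": 78.0}
for function_id, old, new in find_regressions(baseline, current):
    print(f"REGRESSION {function_id}: {old} -> {new}")
```

A gate built on this would exit non-zero whenever the returned list is non-empty, so the bar can hold or drop but never rise unnoticed.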
Other patterns
Team gating PRs in CI. Run pytest --cov and riskratchet check --format pr-comment in GitHub Actions; pipe to gh pr comment. The PR-comment format starts with a hidden marker so the bot updates the same comment on each push instead of spamming. The ratchet is mechanical and unowned, so nobody has to play "the complexity cop" in code review. See Using riskratchet from an AI coding agent.
Pre-commit hook for a solo repo. Wire pytest-cov and riskratchet into .pre-commit-config.yaml so every commit regenerates coverage and gates the commit on no regressions. See Pre-commit integration.
Investigating one ugly function. Use riskratchet explain path/to/file.py::qualname to dump the six component scores and find the driver (complexity vs. coverage vs. sprawl). After refactoring, run riskratchet diff --json | jq '.improved[], .regressed[]' to prove the change was net-positive, not just rearranging deck chairs.
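Where jq is not available, the same filter can be written in Python. This is a sketch that relies only on the top-level `improved` and `regressed` arrays implied by the jq expression above; the shape of each entry is not assumed, and `improved_and_regressed` is a hypothetical helper name.

```python
import json
from itertools import chain
from typing import Any, List


def improved_and_regressed(diff_json: str) -> List[Any]:
    """Equivalent of jq '.improved[], .regressed[]': improved entries, then regressed."""
    data = json.loads(diff_json)
    # Unlike jq, a missing key is treated as an empty list rather than an error.
    return list(chain(data.get("improved", []), data.get("regressed", [])))
```

Feeding it the output of `riskratchet diff --json` (for example via `subprocess.run(..., capture_output=True, text=True).stdout`) yields one Python object per changed function.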
Why CRAP alone is useful but incomplete
The classic CRAP score (CC^2 * (1 - line_coverage)^3 + CC) catches one shape of bad code: complex and poorly tested. That's a real problem, but it misses several others that ship to production just as often:
A function with low complexity and zero tests. CRAP gives it only CC^2 + CC (6 when CC = 2). Risk is real but invisible.
A function with full line coverage but every branch covered the same way. CRAP only looks at line coverage.
A function in a 2,000-line module everyone is afraid to touch. Sprawl is invisible to CRAP.
A function that changed in 40 of the last 90 commits. Churn is invisible to CRAP.
riskratchet keeps CRAP as a reported metric and computes its own composite score from six weighted components so those other risks show up too.
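The blind spots are easy to see by evaluating the formula directly. This is a small Python sketch of the classic CRAP formula quoted above; it does not reproduce riskratchet's composite score, and the function name `crap` is illustrative.

```python
def crap(cc: int, line_coverage: float) -> float:
    """Classic CRAP score: CC^2 * (1 - line_coverage)^3 + CC.

    cc            -- cyclomatic complexity (>= 1)
    line_coverage -- fraction of lines covered, between 0.0 and 1.0
    """
    if cc < 1:
        raise ValueError("cyclomatic complexity must be at least 1")
    if not 0.0 <= line_coverage <= 1.0:
        raise ValueError("line_coverage must be between 0.0 and 1.0")
    return cc ** 2 * (1.0 - line_coverage) ** 3 + cc


print(crap(2, 0.0))   # simple but untested: 6.0
print(crap(15, 1.0))  # complex but fully line-covered: 15.0
print(crap(15, 0.0))  # complex and untested: 240.0
```

Only the third case stands out. The untested two-branch function and the fully line-covered complex one both score low, and nothing in the formula reacts to branch coverage, module size, or churn.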
Pre-commit integration
How pre-commit and riskratchet fit together
Two things about pre-commit matter for riskratchet:
Pre-commit hides your unstaged edits before running hooks. Hooks only see the code you're actually about to commit. Useful in general, but it means riskratchet sees a different source tree than the one open in your editor.
Each language: python hook runs in its own isolated virtualenv that contains riskratchet and its declared deps, not your project's pytest, application code, or test plugins.
Together these create one requirement: the coverage.json riskratchet reads must reflect the same stashed source tree it's analyzing. Reusing an old coverage.json from before pre-commit stashed your edits drifts source and coverage out of sync.
That's why the published hook ships with --no-auto-cov --allow-missing-coverage by default: safe but limited. Pick one of the patterns below to make it useful.
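A rough way to spot drift between source and coverage is to compare modification times. The Python sketch below is a heuristic illustration only, not how riskratchet detects stale coverage; `stale_sources` is a hypothetical helper, and mtimes can be misleading after a stash or checkout.

```python
import os
from typing import Iterable, List


def stale_sources(coverage_path: str, source_paths: Iterable[str]) -> List[str]:
    """Return the source files modified after the coverage report was written."""
    report_mtime = os.path.getmtime(coverage_path)
    return [path for path in source_paths
            if os.path.getmtime(path) > report_mtime]
```

A non-empty result means coverage.json predates at least one edit, so the report should be regenerated before it is trusted.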
Pattern A: pre-generate coverage in a sibling hook (recommended)
Run pytest --cov inside the same pre-commit chain so the coverage matches the stashed tree exactly.
repos:
  - repo: local
    hooks:
      - id: pytest-cov
        name: pytest --cov (produces coverage.json for riskratchet)
        entry: pytest --cov --cov-branch --cov-report=json:coverage.json -q
        language: system
        pass_filenames: false
        always_run: true
  - repo: https://github.com/KayhanB21/riskratchet
    rev: v0.2.13
    hooks:
      - id: riskratchet
        args:
          - "src"
          - "--coverage"
          - "coverage.json"
          - "--baseline"
          - ".riskratchet.json"
Variant: uv / poetry projects (all language: system)
Skip the isolated venv entirely and run both hooks inside your project's environment. This is what riskratchet itself uses:
repos:
  - repo: local
    hooks:
      - id: pytest-cov
        entry: uv run pytest --cov --cov-branch --cov-report=json:coverage.json -q
        language: system
        pass_filenames: false
        always_run: true
      - id: riskratchet
        entry: uv run riskratchet check src --coverage coverage.json --baseline .riskratchet.json --no-auto-cov
        language: system
        pass_filenames: false
        always_run: true
Two upsides: single env for both hooks (no isolated-venv surprises), and uv run resolves the same Python and deps uv sync set up. Downside: contributors must have uv installed locally.
Pattern B: let riskratchet run pytest itself
Override the hook to language: system so it inherits your shell PATH (and finds your real pytest):
repos:
  - repo: local
    hooks:
      - id: riskratchet
        entry: riskratchet check src --baseline .riskratchet.json
        language: system
        pass_filenames: false
        always_run: true
riskratchet run
[truncated for AI cost control]