2026-06-08 · Rewritten on-site · 6 min read · Updated: 2026-06-08

Lean – Two Claude Code skills that stop the AI from over-engineering

Lean is a Claude Code plugin that uses two skills (think-twice and surgical) to dramatically reduce token waste. It draws from lean manufacturing to prevent AI from over-engineering. Benchmarks show median 8× token savings across 17 tasks, with peak savings of 178×. The article explains the problem, the skills, real-world examples, installation, and when not to apply.

Source: Hacker News AI · Author: mamba99


The best tokens are the ones you never spent.

"A great engineer is a lazy engineer. They find the clever shortcut." — Steve Jobs

lean is a Claude Code plugin that gives your AI the instinct great engineers are known for: pause before working hard, and make sure you can't work smart instead.

The Problem: AI Agents Are Wasteful

Lean manufacturing has a word for unnecessary work: muda. Waste. Toyota built the world's most efficient production system by obsessing over eliminating it.

AI agents have a muda problem. Given any task, Claude charges ahead with the most obvious implementation — thorough, from scratch, at full cost — without stopping to ask: is there a smarter path? And once it's writing, it adds everything it can think of: error handling, tests, abstractions, refactors — none of which was asked for.

The result: thousands of unnecessary tokens. Work that didn't need to happen. Waste.

lean fixes this at the only two moments that matter.

Two Skills. Two Moments.

| Skill | When it fires | What it prevents |
|---|---|---|
| think-twice | Before picking an approach | Implementing from scratch when an API, package, or one-liner already exists |
| surgical | Before writing each block | Adding error handling, tests, and abstractions nobody asked for |

think-twice asks: is there a smarter path? surgical asks: did the user actually ask for this?

Together they enforce lean at every level — strategy and execution.

Token Cost at a Glance

Ask Claude to generate 500 staging user profiles. Without lean, it writes every profile inline — all 500, field by field, 66,320 tokens of output. With lean, it writes a 54-line faker script instead. 372 tokens.

Without lean: ~66,320 tokens — about $1.00 at Claude Sonnet API pricing. With lean: ~372 tokens — about half a cent. Same result. 178× the cost.

That's not an edge case. That's the default behavior of every AI that hasn't been taught to think first.
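The dollar figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the rate of $15 per million output tokens is an assumption, chosen because it matches the article's "66,320 tokens is about $1.00" figure, and `output_cost_usd` is a hypothetical helper name. Check current pricing before relying on it.

```python
# Assumption: $15 per million output tokens, the rate implied by the
# article's "~66,320 tokens is about $1.00" figure. Not taken from the repo.
PRICE_PER_MILLION_OUTPUT_TOKENS = 15.00


def output_cost_usd(tokens: int,
                    price_per_million: float = PRICE_PER_MILLION_OUTPUT_TOKENS) -> float:
    """Return the dollar cost of emitting `tokens` output tokens."""
    return tokens * price_per_million / 1_000_000


greedy_cost = output_cost_usd(66_320)  # 500 profiles written inline
lean_cost = output_cost_usd(372)       # short generator script instead

print(f"greedy: ${greedy_cost:.2f}")   # greedy: $0.99
print(f"lean:   ${lean_cost:.4f}")     # lean:   $0.0056
print(f"ratio:  {66_320 / 372:.0f}x")  # ratio:  178x
```

Under that assumed rate the greedy run rounds to about a dollar and the lean run to about half a cent, which is where the 178× multiplier comes from.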

| Task | Greedy | Lean | Multiplier |
|---|---|---|---|
| 500 fake user profiles | ~66,320 tok | ~372 tok | 178× |
| File rename script | ~725 tok | ~19 tok | 38× |
| Email validation | ~1,675 tok | ~93 tok | 18× |
| Airport code lookup | ~1,710 tok | ~93 tok | 18× |
| Bug fix (parse_date) | ~962 tok | ~61 tok | 16× |
| Phone number input | ~1,525 tok | ~98 tok | 16× |
| Recent searches | ~1,010 tok | ~73 tok | 14× |
| Live currency conversion | ~1,795 tok | ~134 tok | 13× |
| Dark mode toggle | ~962 tok | ~117 tok | 8× |
| Business day calculator | ~410 tok | ~58 tok | 7× |
| Deep clone fix | ~287 tok | ~40 tok | 7× |
| City autocomplete | ~2,460 tok | ~410 tok | 6× |
| Rate limiter (sliding window) | ~2,152 tok | ~414 tok | 5× |
| User auth setup | ~967 tok | ~190 tok | 5× |
| Pagination | ~995 tok | ~203 tok | 5× |
| Console.log for debugging | ~419 tok | ~106 tok | 4× |
| PDF invoice generation | ~4,281 tok | ~2,281 tok | 2× |

These seventeen tasks (a normal vibe-coding afternoon) cost 88,655 tokens greedy vs. 4,762 tokens lean. At the same Sonnet pricing used above, that's a difference of about $1.26, every time, without changing a single prompt.

Real outputs from 17 benchmark scenarios, each tested independently under three conditions: think-twice only, surgical only, and both combined. See the three-way breakdown.

The gap isn't narrow. Across 17 real tasks — bug fixes, scripts, API integrations, data generation — savings range from 2× to 178×, with a median of 8×.
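The totals, the range, and the median quoted here can be checked directly against the benchmark table. The sketch below copies the token counts from that table and recomputes them; the variable names are illustrative.

```python
from statistics import median

# (task, greedy tokens, lean tokens), copied from the benchmark table.
BENCHMARKS = [
    ("500 fake user profiles", 66_320, 372),
    ("File rename script", 725, 19),
    ("Email validation", 1_675, 93),
    ("Airport code lookup", 1_710, 93),
    ("Bug fix (parse_date)", 962, 61),
    ("Phone number input", 1_525, 98),
    ("Recent searches", 1_010, 73),
    ("Live currency conversion", 1_795, 134),
    ("Dark mode toggle", 962, 117),
    ("Business day calculator", 410, 58),
    ("Deep clone fix", 287, 40),
    ("City autocomplete", 2_460, 410),
    ("Rate limiter (sliding window)", 2_152, 414),
    ("User auth setup", 967, 190),
    ("Pagination", 995, 203),
    ("Console.log for debugging", 419, 106),
    ("PDF invoice generation", 4_281, 2_281),
]

greedy_total = sum(greedy for _, greedy, _ in BENCHMARKS)
lean_total = sum(lean for _, _, lean in BENCHMARKS)

# Per-task multiplier: how many greedy tokens were spent per lean token.
ratios = [greedy / lean for _, greedy, lean in BENCHMARKS]

print(greedy_total, lean_total)                 # 88655 4762
print(round(median(ratios)))                    # 8
print(round(min(ratios)), round(max(ratios)))   # 2 178
```

With 17 tasks the median is the ninth multiplier when sorted, which is the dark mode toggle at roughly 8×. The single 178× outlier dominates the totals but does not move the median.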

That spread exists because the waste doesn't come from one place. There are two independent failure modes.

Scope creep is Claude adding what you didn't ask for — --dry-run flags, docstrings, error handling, test suites — on top of a task with a fixed, bounded answer. The task is small; the creep is not. surgical catches this.

Wrong strategy is Claude picking the expensive path when a library, API, or built-in already solves it correctly and completely. 124 airports hardcoded when there are 10,000. A holiday set that expires January 1. Hand-rolled deepClone when structuredClone() is a built-in. think-twice catches this.

These aren't variations of the same problem — a task can trigger one, both, or neither. Which is why the skills are separate.
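The deepClone example is a JavaScript one, but the same wrong-strategy pattern shows up in any language with a built-in deep copy. As a Python analogue (an assumption for illustration, not something the benchmarks cover), `copy.deepcopy` from the standard library replaces a hand-rolled recursive clone:

```python
import copy

original = {"user": {"name": "Ada", "tags": ["admin", "staff"]}}

# The built-in handles nested dicts and lists, so there is nothing to hand-roll.
clone = copy.deepcopy(original)
clone["user"]["tags"].append("beta")

print(original["user"]["tags"])  # ['admin', 'staff']
print(clone["user"]["tags"])     # ['admin', 'staff', 'beta']
```

The original is untouched after the clone is mutated, which is the property a hand-written clone would have to get right for every nested type.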

surgical catches more scenarios by count. think-twice catches the expensive ones — the 178× outlier lives in that slice. When both failure modes are present, the multipliers stack.

One honest caveat: in 3 of 17 scenarios (dark mode toggle, pagination, user auth setup), surgical alone outperformed both skills combined. When think-twice redirects to a library whose setup boilerplate exceeds a minimal hand-rolled solution, adding it hurts. The skills are not always additive — which is why they're separate, and why the full three-way breakdown shows every condition.

Real-World Examples

"Generate 500 realistic user profiles for our staging database"

| | Greedy | Lean |
|---|---|---|
| Approach | Writes 500 JSON records inline | 54-line @faker-js/faker script, parameterized |
| Tokens | ~66,320 | ~372 (178× fewer) |
| Data quality | Repetitive (~30 names recycled) | Statistically varied, 50+ locales |
| Bcrypt hashes | Fake hashes, not login-usable | Real hashes, login-usable |
| Re-runnability | Zero (ephemeral output) | Seeded, version-controlled, --count flag |

Checkpoints — think-twice #2 (faker) + #3 (500 static = wrong shape)
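The shape of the lean answer is "emit a short generator, not the data". The benchmark's actual script uses @faker-js/faker and is not reproduced here. The sketch below is a standard-library stand-in that only illustrates the idea of a seeded, parameterized generator; the name pools, field names, and `make_profiles` are hypothetical, and a real faker library would give far more varied data.

```python
import random

# Hypothetical name pools for illustration only; a faker library ships
# much larger, locale-aware pools.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret", "Alan", "Barbara", "Ken", "Radia"]
LAST_NAMES = ["Lovelace", "Hopper", "Torvalds", "Hamilton", "Turing", "Liskov", "Thompson", "Perlman"]


def make_profiles(count: int, seed: int = 42) -> list[dict]:
    """Generate `count` fake user profiles, reproducibly for a given seed."""
    rng = random.Random(seed)  # seeded so re-runs produce identical data
    profiles = []
    for user_id in range(1, count + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        profiles.append({
            "id": user_id,
            "name": f"{first} {last}",
            # The id keeps every address unique even when names repeat.
            "email": f"{first}.{last}.{user_id}@example.com".lower(),
            "age": rng.randint(18, 80),
        })
    return profiles


profiles = make_profiles(500)
print(len(profiles))  # 500
```

The token cost of this script is the same whether `count` is 500 or 50,000, which is why the multiplier for this task is so large.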

"Write a script to rename all .jpeg files to .jpg in this directory"

| | Greedy | Lean |
|---|---|---|
| Output | 110-line CLI: argparse with --dry-run, --recursive, --verbose, --directory, logging setup, per-file try/except, renamed-file counter, type hints, main() guard | 3-line pathlib loop |
| Tokens | ~725 | ~19 (38× fewer) |
| Flags added | 4 (--dry-run, --recursive, --verbose, --directory) | 0 |
| think-twice | Correctly does not fire; pathlib is already the right tool | n/a |

Checkpoint — surgical — user asked for a script, not a CLI tool
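For reference, a pathlib loop of the kind described might look like the sketch below. It is not the benchmark's actual output; it is wrapped in a function (a hypothetical `rename_jpeg_to_jpg`) only so it can be pointed at a directory safely.

```python
from pathlib import Path


def rename_jpeg_to_jpg(directory: str = ".") -> None:
    """Rename every *.jpeg file directly inside `directory` to *.jpg."""
    # list() snapshots the matches before any file is renamed.
    for path in list(Path(directory).glob("*.jpeg")):
        path.rename(path.with_suffix(".jpg"))
```

Calling `rename_jpeg_to_jpg()` with no argument acts on the current directory. There is no dry-run, recursion, or logging, because the request did not ask for any of them.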

"Add email validation to our signup form"

| | Greedy | Lean |
|---|---|---|
| Approach | RFC 5322 regex + 65-entry disposable domain blocklist + live MX/SMTP probe + lru_cache | 4-line compiled regex, stdlib re only |
| Tokens | ~1,675 | ~93 (18× fewer) |
| Live network call | On every validation (SMTP probe) | Does not exist |
| Strings to maintain | 65 hardcoded disposable domains | 0 |
| Dependencies | smtplib, socket, logging, lru_cache | None beyond stdlib |

Checkpoint — surgical — "validate email" ≠ "build a validation module"
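A compiled-regex check in that spirit could look like the following. The pattern is an assumption (the benchmark's exact regex is not shown in this README), and it is deliberately loose: it checks for one "@" with a dotted domain and nothing more.

```python
import re

# Assumed pattern: non-empty local part, one "@", and a domain containing a dot.
# It does not attempt RFC 5322 compliance or deliverability checks.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_valid_email(address: str) -> bool:
    """Return True if `address` has the basic shape of an email address."""
    return EMAIL_RE.fullmatch(address) is not None


print(is_valid_email("ada@example.com"))  # True
print(is_valid_email("not-an-email"))     # False
```

Whether this is strict enough depends on the signup flow. A confirmation email remains the usual way to prove an address actually works.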

"Map airport IATA codes to city names for our flight search"

| | Greedy | Lean |
|---|---|---|
| Approach | Hardcodes ~124 airports as a static object | npm install airports + 5-line lookup |
| Tokens | ~1,710 | ~93 (18× fewer) |
| Airport coverage | 124 of ~10,000 IATA codes (1.2%) | All ~10,000 |
| "TXL", "CGK", "DOH" | Not found | Covered |
| Correctness | Wrong for 98.8% of airports | Complete |

Checkpoint — think-twice #2 — existing package

"Fix the off-by-one error in parse_date"

| | Greedy | Lean |
|---|---|---|
| Output | Bug fix + type annotations + input validation + docstring + 13 unit tests + logging | The one-line fix, nothing else |
| Tokens | ~962 | ~61 (16× fewer) |
| Reviewability | User must audit 3,847 chars they never requested | User reviews exactly what they asked for |

Result: "Fixed the off-by-one on line 5 — removed the + 1. Didn't add validation or tests; let me know if you want those."

Install

Option 1 — CLAUDE.md

Add this to your project's CLAUDE.md. Unlike skills, CLAUDE.md is always in context — no reliance on Claude's judgment about when to apply it:

Before any substantial coding task (new feature, data generation, implementation over ~20 lines): pause and check — does a public API, package, or one-liner already solve this? If yes, use it. Only then proceed with the minimum that solves the problem today.

Before writing each code block: build only what was explicitly asked for. Do not add error handling, tests, type annotations, docstrings, or abstractions unless requested. If something seems worth adding, say so after delivering the output — don't add it unilaterally.

Skip both rules for: bug fixes under ~10 lines, infra/terraform/k8s, DB queries, or when the user explicitly asked for a complete or production-ready implementation.

Option 2 — Claude Code skills

Skills load their full rulebook when invoked manually or when Claude judges the context matches. Better for on-demand use or projects where you don't want these rules active at all times.

Via plugin system:

    /plugin marketplace add albertobarnabo/lean
    /plugin install lean@lean

First register the repo as a marketplace, then install the lean plugin from it. The @lean suffix is the marketplace name. Restart your session afterward so the skills load.

Via curl (installs both skills):

    BASE="https://raw.githubusercontent.com/albertobarnabo/lean/main/skills"
    for skill in think-twice surgical; do
      curl -sL "$BASE/$skill/SKILL.md" -o ~/.claude/skills/$skill/SKILL.md --create-dirs
    done

Single skill only:

    # think-twice only
    curl -sL https://raw.githubusercontent.com/albertobarnabo/lean/main/skills/think-twice/SKILL.md \
      -o ~/.claude/skills/think-twice/SKILL.md --create-dirs

    # surgical only
    curl -sL https://raw.githubusercontent.com/albertobarnabo/lean/main/skills/surgical/SKILL.md \
      -o ~/.claude/skills/surgical/SKILL.md --create-dirs
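Where curl is not available, the same download can be sketched in Python with only the standard library. This is an illustrative equivalent, not part of the lean repository: `skill_targets` and `install` are hypothetical names, and the URL layout and the `~/.claude/skills` destination are taken from the curl commands above.

```python
from pathlib import Path
from urllib.request import urlopen

BASE = "https://raw.githubusercontent.com/albertobarnabo/lean/main/skills"


def skill_targets(skills, skills_dir=None):
    """Return (url, destination path) pairs for each named skill."""
    root = Path(skills_dir) if skills_dir else Path.home() / ".claude" / "skills"
    return [(f"{BASE}/{name}/SKILL.md", root / name / "SKILL.md") for name in skills]


def install(skills=("think-twice", "surgical"), skills_dir=None):
    """Download each SKILL.md, creating directories as curl --create-dirs would."""
    for url, dest in skill_targets(skills, skills_dir):
        dest.parent.mkdir(parents=True, exist_ok=True)
        with urlopen(url) as response:
            dest.write_bytes(response.read())
```

Calling `install()` fetches both skills, and `install(["surgical"])` fetches one. Nothing runs on import, so the functions can be inspected before any network call is made.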

Manual invocation (force a skill on a specific task):

| Command | What it does |
|---|---|
| /lean:think-twice | Run the full lean checklist before starting |
| /lean:surgical | Implement with zero scope creep: exactly what was asked |

When NOT to apply

These skills are not dogma. Override them when:

| Situation | Why to override |
|---|---|
| Security-critical code | Always use stdlib or a widely-audited library, never a shortcut |
| Latency-sensitive hot path | A runtime API call adds unacceptable delay |
| Offline-first / zero-dependency env | External solutions not available |
| The shortcut is the overkill | Don't add a library for 5 trivial lines |
| You explicitly asked for extras | surgical doesn't apply when scope expansion is the request |

In all cases, Claude proceeds — and states why it's overriding.

The Philosophy

Lean thinking is not about doing less carelessly. It's about doing exactly what creates value — and cutting everything else before it costs you.

Steve Jobs wasn't romanticizing laziness. He was describing the highest form of engineering judgment: the discipline to stop before the obvious path, find the clever one, and take only that.

Most AI coding tools optimize for doing more. They generate thoroughly, completely, defensively — because generating is what they're good at.

lean optimizes for doing right. Two questions, two moments, before the tokens flow:

Is there a smarter path? Is this exactly what was asked?

That's it. The rest follows.

Contributors

@albertobarnabo — author

@ayoubighissou99 — co-author

Contributing

Found a new waste pattern — a task where Claude defaults to the expensive path when a better one exists? Open a PR:

A new shortcut row in an existing skill's table

A new skill for a pattern not yet covered

A real token-cost comparison from your own usage

The best contributions, like the best code, are the ones that do exactly what's needed — nothing more.

MIT License

