Lean – Two Claude Code skills that stop the AI from over-engineering
Lean is a Claude Code plugin that uses two skills (think-twice and surgical) to dramatically reduce token waste. It draws from lean manufacturing to prevent AI from over-engineering. Benchmarks show median 8× token savings across 17 tasks, with peak savings of 178×. The article explains the problem, the skills, real-world examples, installation, and when not to apply.
Notifications You must be signed in to change notification settings
Fork 1
Star 17
BranchesTags
Open more actions menu
Folders and files
NameName
Last commit message
Last commit date
Latest commit
History
34 Commits
34 Commits
.claude-plugin
.claude-plugin
assets
assets
commands
commands
skills
skills
tests
tests
.gitignore
.gitignore
CLAUDE.md
CLAUDE.md
LICENSE
LICENSE
README.md
README.md
Repository files navigation
The best tokens are the ones you never spent.
"A great engineer is a lazy engineer. They find the clever shortcut." — Steve Jobs
lean is a Claude Code plugin that gives your AI the instinct great engineers are known for: pause before working hard, and make sure you can't work smart instead.
The Problem: AI Agents Are Wasteful
Lean manufacturing has a word for unnecessary work: muda. Waste. Toyota built the world's most efficient production system by obsessing over eliminating it.
AI agents have a muda problem. Given any task, Claude charges ahead with the most obvious implementation — thorough, from scratch, at full cost — without stopping to ask: is there a smarter path? And once it's writing, it adds everything it can think of: error handling, tests, abstractions, refactors — none of which was asked for.
The result: thousands of unnecessary tokens. Work that didn't need to happen. Waste.
lean fixes this at the only two moments that matter.
Two Skills. Two Moments.
Skill When it fires What it prevents
think-twice Before picking an approach Implementing from scratch when an API, package, or one-liner already exists
surgical Before writing each block Adding error handling, tests, and abstractions nobody asked for
think-twice asks: is there a smarter path? surgical asks: did the user actually ask for this?
Together they enforce lean at every level — strategy and execution.
Token Cost at a Glance
Ask Claude to generate 500 staging user profiles. Without lean, it writes every profile inline — all 500, field by field, 66,320 tokens of output. With lean, it writes a 54-line faker script instead. 372 tokens.
Without lean: ~66,320 tokens — about $1.00 at Claude Sonnet API pricing. With lean: ~372 tokens — about half a cent. Same result. 178× the cost.
That's not an edge case. That's the default behavior of every AI that hasn't been taught to think first.
Task Greedy Lean Multiplier
500 fake user profiles ~66,320 tok ~372 tok 178×
File rename script ~725 tok ~19 tok 38×
Email validation ~1,675 tok ~93 tok 18×
Airport code lookup ~1,710 tok ~93 tok 18×
Bug fix — parse_date ~962 tok ~61 tok 16×
Phone number input ~1,525 tok ~98 tok 16×
Recent searches ~1,010 tok ~73 tok 14×
Live currency conversion ~1,795 tok ~134 tok 13×
Dark mode toggle ~962 tok ~117 tok 8×
Business day calculator ~410 tok ~58 tok 7×
Deep clone fix ~287 tok ~40 tok 7×
City autocomplete ~2,460 tok ~410 tok 6×
Rate limiter — sliding window ~2,152 tok ~414 tok 5×
User auth setup ~967 tok ~190 tok 5×
Pagination ~995 tok ~203 tok 5×
Console.log for debugging ~419 tok ~106 tok 4×
PDF invoice generation ~4,281 tok ~2,281 tok 2×
These seventeen tasks — a normal vibe-coding afternoon — cost 88,655 tokens greedy vs. 4,762 tokens lean. That's a $1.10 difference, every time, without changing a single prompt.
Real outputs from 17 benchmark scenarios, tested independently under three conditions each: think-twice only, surgical only, and both combined. Three-way breakdown →
The gap isn't narrow. Across 17 real tasks — bug fixes, scripts, API integrations, data generation — savings range from 2× to 178×, with a median of 8×.
That spread exists because the waste doesn't come from one place. There are two independent failure modes.
Scope creep is Claude adding what you didn't ask for — --dry-run flags, docstrings, error handling, test suites — on top of a task with a fixed, bounded answer. The task is small; the creep is not. surgical catches this.
Wrong strategy is Claude picking the expensive path when a library, API, or built-in already solves it correctly and completely. 124 airports hardcoded when there are 10,000. A holiday set that expires January 1. Hand-rolled deepClone when structuredClone() is a built-in. think-twice catches this.
These aren't variations of the same problem — a task can trigger one, both, or neither. Which is why the skills are separate.
surgical catches more scenarios by count. think-twice catches the expensive ones — the 178× outlier lives in that slice. When both failure modes are present, the multipliers stack.
One honest caveat: in 3 of 17 scenarios (dark mode toggle, pagination, user auth setup), surgical alone outperformed both skills combined. When think-twice redirects to a library whose setup boilerplate exceeds a minimal hand-rolled solution, adding it hurts. The skills are not always additive — which is why they're separate, and why the full three-way breakdown shows every condition.
Real-World Examples
"Generate 500 realistic user profiles for our staging database"
Greedy Lean
Approach Writes 500 JSON records inline 54-line @faker-js/faker script, parameterized
Tokens ~66,320 ~372 — 178x fewer
Data quality Repetitive (~30 names recycled) Statistically varied, 50+ locales
Bcrypt hashes Fake hashes — not login-usable Real hashes — login-usable
Re-runnability Zero — ephemeral output Seeded, version-controlled, --count flag
Checkpoints — think-twice #2 (faker) + #3 (500 static = wrong shape)
"Write a script to rename all .jpeg files to .jpg in this directory"
Greedy Lean
Output 110-line CLI — argparse with --dry-run, --recursive, --verbose, --directory, logging setup, per-file try/except, renamed-file counter, type hints, main() guard 3-line pathlib loop
Tokens ~725 ~19 — 38x fewer
Flags added 4 (--dry-run, --recursive, --verbose, --directory) 0
think-twice Correctly does not fire — pathlib is already the right tool —
Checkpoint — surgical — user asked for a script, not a CLI tool
"Add email validation to our signup form"
Greedy Lean
Approach RFC 5322 regex + 65-entry disposable domain blocklist + live MX/SMTP probe + lru_cache 4-line compiled regex, stdlib re only
Tokens ~1,675 ~93 — 18x fewer
Live network call On every validation (SMTP probe) Does not exist
Strings to maintain 65 hardcoded disposable domains 0
Dependencies smtplib, socket, logging, lru_cache None beyond stdlib
Checkpoint — surgical — "validate email" ≠ "build a validation module"
"Map airport IATA codes to city names for our flight search"
Greedy Lean
Approach Hardcodes ~124 airports as a static object npm install airports + 5-line lookup
Tokens ~1,710 ~93 — 18x fewer
Airport coverage 124 of ~10,000 IATA codes (1.2%) All ~10,000
"TXL", "CGK", "DOH" Not found Covered
Correctness Wrong for 98.8% of airports Complete
Checkpoint — think-twice #2 — existing package
"Fix the off-by-one error in parse_date"
Greedy Lean
Output Bug fix + type annotations + input validation + docstring + 13 unit tests + logging The one-line fix, nothing else
Tokens ~962 ~61 — 16x fewer
Reviewability User must audit 3,847 chars they never requested User reviews exactly what they asked for
Result: "Fixed the off-by-one on line 5 — removed the + 1. Didn't add validation or tests; let me know if you want those."
Install
Option 1 — CLAUDE.md
Add this to your project's CLAUDE.md. Unlike skills, CLAUDE.md is always in context — no reliance on Claude's judgment about when to apply it:
Before any substantial coding task (new feature, data generation, implementation over ~20 lines): pause and check — does a public API, package, or one-liner already solve this? If yes, use it. Only then proceed with the minimum that solves the problem today.
Before writing each code block: build only what was explicitly asked for. Do not add error handling, tests, type annotations, docstrings, or abstractions unless requested. If something seems worth adding, say so after delivering the output — don't add it unilaterally.
Skip both rules for: bug fixes under ~10 lines, infra/terraform/k8s, DB queries, or when the user explicitly asked for a complete or production-ready implementation.
Option 2 — Claude Code skills
Skills load their full rulebook when invoked manually or when Claude judges the context matches. Better for on-demand use or projects where you don't want these rules active at all times.
Via plugin system:
/plugin marketplace add albertobarnabo/lean /plugin install lean@lean
First register the repo as a marketplace, then install the lean plugin from it. The @lean suffix is the marketplace name. Restart your session afterward so the skills load.
Via curl (installs both skills):
BASE="https://raw.githubusercontent.com/albertobarnabo/lean/main/skills" for skill in think-twice surgical; do curl -sL "$BASE/$skill/SKILL.md" -o ~/.claude/skills/$skill/SKILL.md --create-dirs done
Single skill only:
think-twice only
curl -sL https://raw.githubusercontent.com/albertobarnabo/lean/main/skills/think-twice/SKILL.md \ -o ~/.claude/skills/think-twice/SKILL.md --create-dirs
surgical only
curl -sL https://raw.githubusercontent.com/albertobarnabo/lean/main/skills/surgical/SKILL.md \ -o ~/.claude/skills/surgical/SKILL.md --create-dirs
Manual invocation (force a skill on a specific task):
Command What it does
/lean:think-twice Run the full lean checklist before starting
/lean:surgical Implement with zero scope creep — exactly what was asked
When NOT to apply
These skills are not dogma. Override them when:
Situation Why to override
Security-critical code Always use stdlib or a widely-audited library — never a shortcut
Latency-sensitive hot path A runtime API call adds unacceptable delay
Offline-first / zero-dependency env External solutions not available
The shortcut is the overkill Don't add a library for 5 trivial lines
You explicitly asked for extras surgical doesn't apply when scope expansion is the request
In all cases, Claude proceeds — and states why it's overriding.
The Philosophy
Lean thinking is not about doing less carelessly. It's about doing exactly what creates value — and cutting everything else before it costs you.
Steve Jobs wasn't romanticizing laziness. He was describing the highest form of engineering judgment: the discipline to stop before the obvious path, find the clever one, and take only that.
Most AI coding tools optimize for doing more. They generate thoroughly, completely, defensively — because generating is what they're good at.
lean optimizes for doing right. Two questions, two moments, before the tokens flow:
Is there a smarter path? Is this exactly what was asked?
That's it. The rest follows.
Contributors
@albertobarnabo — author
@ayoubighissou99 — co-author
Contributing
Found a new waste pattern — a task where Claude defaults to the expensive path when a better one exists? Open a PR:
A new shortcut row in an existing skill's table
A new skill for a pattern not yet covered
A real token-cost comparison from your own usage
The best contributions, like the best code, are the ones that do exactly what's needed — nothing more.
MIT License
About
Teaches Claude to find the clever path before taking the obvious one. 8× fewer tokens on the median real-world task — measured across 17 benchmarks.
Topics
skill
claude
cost-saving
llm
anthropic
ai-productivity
claude-code
token-optimization
token-efficiency
claude-skills
think-twice
Resources
Readme
License
MIT license
Uh oh!
There was an error while loading. Please reload this page.
Activity
Stars
17 stars
Watchers
0 watching
Forks
1 fork
Report repository
Releases
No releases published
Packages 0
Uh oh!
There was an error while loading. Please reload this page.
Contributors
Uh oh!
There was an error while loading. Please reload this page.