Stop Asking Claude to Agree with You
This article introduces Matt Pocock's /grill-me skill, which forces developers to clarify requirements through relentless questioning, avoiding wasted effort on vague ideas. It details the four-step workflow from /grill-me to final implementation, along with tips to keep Claude Code efficient.
The best Claude skill I’ve found writes no code. It just won’t let me start until I know what I’m building.
Why Plan Mode isn’t enough
For a long while, my default move with any non-trivial feature was Claude Code’s plan mode. Sketch the idea, let it draft a plan, skim it, hit go.
The cracks showed up fast.
A fuzzy ask produces a fuzzy plan. The output looks structured and confident, but it’s quietly inheriting every gap in your thinking.
The agent drifts mid-execution. You approve a plan, switch to execute, and it quietly adds files and patterns the plan never mentioned. You end up appending “implement exactly as written, nothing extra” to every run.
It never challenges the idea itself. Plan mode answers how Claude would build your thing. It never asks whether the thing is worth building, or whether you’ve actually thought it through. You can burn three days producing a clean, well-structured mistake.
That last one is the real gap. Plan mode assumes your idea is already sound and your only problem is execution. Most of the time, that assumption is wrong — and that’s the exact gap /grill-me fills.
The skill that argues back
Matt Pocock — the TypeScript educator behind Total TypeScript — opened up his personal .claude directory as a public skills repo on February 3, 2026. Within three months it had crossed 50,000 GitHub stars, peaking near the top of GitHub’s trending charts. Not a framework, not a wrapper — just the actual markdown files he uses every day.
The repo targets one failure mode: misalignment. You think the agent understood you. Then you see what it built, and it clearly didn’t. The same communication gap you get with a human dev, now at machine speed.
The headline fix is /grill-me. It removes vagueness before any code exists by interrogating your plan one decision at a time. The skill is only a few sentences long, and this is its core:
Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one by one.
That’s it. No clever machinery — the value is entirely in being forced to answer.
Getting it running:
npx skills add mattpocock/skills
Then run /setup-matt-pocock-skills once per repo to wire up your issue tracker and config.
The questions you didn’t want to answer
Take a real feature — per-workspace billing limits in a multi-tenant SaaS. You sketch the idea and type /grill-me.
Claude: When a workspace hits its limit, does the whole workspace get blocked, or just new usage?
Me: Just new usage, I think.
Claude: Who gets notified — the user who triggered it, the workspace admin, or both? Through what channel?
Me: Admin. Probably email.
Claude: If the admin ignores that email and a user hits the wall mid-session, what’s the experience? Hard error? Degraded mode? Graceful message?
Me: …I hadn’t defined that.
Claude: That’s your entire degradation story. It shapes the API contract, the frontend, and what the billing page has to show. Let’s keep going.
One rough idea. A handful of decisions that reshape the architecture before implementation, instead of surfacing as bugs three days in.
Every answer you give in a grill session is a decision you won’t have to reverse in code review.
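One way to picture "resolving dependencies between decisions one by one" is as an ordering problem: no question gets asked until the decisions it depends on are settled. Here is a minimal sketch in Python, assuming a hand-written dependency map for the billing example. The decision names and the edges between them are hypothetical, and this is not how the skill works internally; the skill is just a prompt.

```python
from graphlib import TopologicalSorter

# Hypothetical decisions for the billing-limit feature. Each key maps to the
# set of decisions that must be settled before it can be answered.
DECISIONS = {
    "block_scope": set(),                                # whole workspace or new usage only
    "notification": {"block_scope"},                     # who is told, through what channel
    "degradation_ux": {"block_scope", "notification"},   # hard error, degraded mode, message
    "api_contract": {"degradation_ux"},
    "billing_page": {"degradation_ux"},
}


def question_order(decisions):
    """Return the decisions in an order where every dependency comes first."""
    return list(TopologicalSorter(decisions).static_order())


if __name__ == "__main__":
    for step, name in enumerate(question_order(DECISIONS), start=1):
        print(f"{step}. {name}")
```

The degradation question from the session above sits in the middle of this graph, which is why leaving it undefined would have rippled into the API contract and the billing page.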
The pipeline
FIG. 01 · The four-step workflow.
The whole thing runs as a single continuous session, each step feeding the next.
/grill-me → /to-prd → /to-issues → /afk
The rule that makes it work: never /clear between steps. The PRD skill leans on everything from the grill — your answers, your reasoning, the edge cases you named, the trade-offs you accepted. A PRD generated cold is a template. A PRD generated right after a grill is a spec with a point of view.
Tight PRD in, tight tickets out — /to-issues produces tasks with real context and acceptance criteria, not “implement auth.” Tight tickets mean /afk can execute without guessing.
There’s a second payoff that shows up later. The PRD and the issues are written artifacts — they live in your tracker, not just in a chat window. So when a session dies halfway, or you come back to the same feature a week later, you’re not reconstructing intent from memory. You point a fresh session at the PRD and the open issues, and it picks up with the full reasoning already in hand: what you decided, why, and what’s left. The grill happens once; the context it produces keeps paying out across every session that touches the feature.
Output quality at the end is a direct function of honesty at the beginning.
When the grill gets hard: /handoff
Deep grill sessions burn context. You’re well into one when you realize you need to prototype something or check how an existing schema behaves. The instinct is to do it right there in the same session.
Don’t.
Pocock also wrote /handoff, built for exactly this. It compresses the current session into a document a fresh agent can pick up — context, decisions, intent, suggested next steps. Your grill stays clean; the side quest gets its own window.
Two patterns:
Fire and forget — something out of scope appears. Hand it to a fresh agent, return to the grill.
Grill → handoff → prototype → handoff back — validate an assumption in a throwaway session, bring the learning home.
The harder the questions, the more context a grill accumulates — and the more valuable it becomes to protect it. /handoff is how you keep going hard without drowning the session.
The deeper the grill, the more there is to lose by derailing it. A handoff lets you chase a tangent and come back to a session that’s still sharp.
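To make the four parts of a handoff concrete (context, decisions, intent, suggested next steps), here is a rough sketch of such a note as a small Python data structure. The class name, field names, and rendered layout are my own assumptions for illustration; the actual skill produces its own document and may structure it differently.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical shape for a handoff note; not the skill's actual format."""

    context: str
    decisions: list[str] = field(default_factory=list)
    intent: str = ""
    next_steps: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the note as plain text a fresh session can be pointed at."""
        lines = ["Context", self.context, "", "Decisions so far"]
        lines += [f"- {decision}" for decision in self.decisions]
        lines += ["", "Intent", self.intent, "", "Suggested next steps"]
        lines += [f"- {step}" for step in self.next_steps]
        return "\n".join(lines)


if __name__ == "__main__":
    note = Handoff(
        context="Grilling per-workspace billing limits",
        decisions=["Block new usage only", "Notify the workspace admin by email"],
        intent="Check how the existing schema stores limits",
        next_steps=["Report findings back to the grill session"],
    )
    print(note.render())
```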
Skills worth adding to the stack
/tdd — a red-green-refactor loop that builds one vertical slice at a time: failing test, minimal implementation, refactor. If your grill and PRD were thorough, the acceptance criteria already exist — this turns them into tests Claude has to satisfy before it can claim something works.
/zoom-out — when you land in an unfamiliar stretch of code and can’t see how it fits the bigger picture, this pulls the agent up a level: it explains the section’s role, its dependencies, and how it connects to the rest of the system. Useful when you inherit a module, come back to old code, or need the lay of the land before changing anything.
/grill-with-docs — a /grill-me variant that tests your plan against the existing domain model and codebase. The right choice when you’re extending a live system rather than starting fresh; it ties new decisions to what already exists and updates context docs inline.
Beyond the grill: getting Claude Code to run lean
The grill is the star of the workflow, but it only pays off if the rest of your setup isn’t quietly bleeding tokens and context. These are the habits that keep Claude Code fast, cheap, and sharp around it.
Run grill sessions on the strongest model; implement on a cheaper one.
A grill is pure exploration, with no file writes and no tool-call overhead. A frontier model like Opus earns its cost here by asking sharper questions and catching contradictions a smaller model misses, and the token bill stays low because it’s just conversation. Once you cross into implementation (/afk, /tdd, execution), a model like Sonnet does the heavy lifting at a fraction of the price.
Spend your best model where decisions are made, not where code is typed.
.claudeignore cuts your context bill in one edit.
By default Claude pulls everything in your project into scope. A .claudeignore file (same syntax as .gitignore) stops the dead weight from auto-loading — excluding .next/ alone can trim 30–40% off context in a Next.js repo. It’s the highest-leverage two minutes you’ll spend.
node_modules/
dist/
build/
.next/
__pycache__/
*.lock
.git/
*.db
This doesn’t stop Claude from reading these files when you explicitly ask — it just keeps them out of automatic exploration.
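If you want a rough sense of how much of your repo those patterns cover before you commit to them, the sketch below estimates the ignored share by size. It assumes a deliberately simplified reading of .gitignore-style patterns (a trailing slash means "any directory with this name", anything else is a glob on the file name), so treat it as an approximation rather than the matching rules the tool itself applies. The function names and the sample sizes are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

PATTERNS = [
    "node_modules/", "dist/", "build/", ".next/",
    "__pycache__/", "*.lock", ".git/", "*.db",
]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Simplified matcher: directory patterns by name, file patterns by glob."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignored if any parent directory has this name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            # File pattern: matched against the file name only.
            return True
    return False


def ignored_share(sizes: dict[str, int], patterns: list[str]) -> float:
    """Fraction of total bytes that falls under the ignore patterns."""
    total = sum(sizes.values())
    ignored = sum(size for path, size in sizes.items() if is_ignored(path, patterns))
    return ignored / total if total else 0.0


if __name__ == "__main__":
    # Made-up sizes in bytes, purely for illustration.
    sample = {"src/app.ts": 600, ".next/cache/chunk.js": 300, "yarn.lock": 100}
    print(f"{ignored_share(sample, PATTERNS):.0%} of bytes ignored")
```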
Keep files under 200–300 lines.
Agents read whole files. When one reads a 1,500-line file to find a 20-line function, it pays for 1,480 lines of noise — and reasons worse for it. This isn’t a prompt trick; it’s a codebase discipline that compounds across every read in a session.
A large file is a tax every agent pays, every time it looks.
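Finding the offenders is easy to script. This is a minimal sketch that walks a directory and lists source files over a line limit, longest first. The 300-line default comes from the range above; the file extensions and the choice to skip node_modules are assumptions you would adjust for your own repo.

```python
from pathlib import Path


def long_files(root: str, limit: int = 300, suffixes=(".ts", ".tsx", ".py")):
    """Return (path, line_count) for source files longer than `limit` lines."""
    base = Path(root)
    results = []
    for path in base.rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Skip vendored dependencies; they are not files you would split.
        if "node_modules" in path.relative_to(base).parts:
            continue
        with path.open(encoding="utf-8", errors="ignore") as handle:
            count = sum(1 for _ in handle)
        if count > limit:
            results.append((str(path), count))
    # Longest files first, since they cost the most on every read.
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for file_path, count in long_files("."):
        print(f"{count:>6}  {file_path}")
```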
Teach Claude to grep, not read — in CLAUDE.md.
CLAUDE.md is the one file Claude reads at the start of every session, so it’s where your project’s standing rules live — stack, conventions, and the things you never want to re-explain. The highest-value rule you can put there: tell it which files to search instead of read. We have a translation-keys file with over 2,000 entries; reading it whole would torch the context budget on noise. So CLAUDE.md says:
grep in packages/survey-type-defs/src/appTranslation/AppTranslationKeyEnum.ts (2000+ keys, never read the full file)
One line, and Claude stops pulling a giant file into context every time it needs a single key. Do this for every oversized-but-unavoidable file in your repo.
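The difference between searching and reading is easy to see in numbers. The sketch below is a plain-Python stand-in for grep, not what Claude runs internally: it returns only the matching lines, then compares how many characters land in context either way. The function names are hypothetical.

```python
from pathlib import Path


def grep(path: str, needle: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for every line containing `needle`."""
    matches = []
    with Path(path).open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line:
                matches.append((number, line.rstrip("\n")))
    return matches


def context_cost(path: str, needle: str) -> tuple[int, int]:
    """Characters pulled into context: whole file vs. matching lines only."""
    whole = len(Path(path).read_text(encoding="utf-8"))
    matched = sum(len(line) for _, line in grep(path, needle))
    return whole, matched
```

On a file with a couple of thousand one-line entries, a search for a single key returns one line, while a full read pays for all of them.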
Disable MCP servers you aren’t using.
Historically, Claude Code loaded MCP tool definitions directly into context. A few heavy servers (Playwright, GitHub, Gmail, etc.) could quietly consume thousands of tokens each, and a sprawling setup could cost tens of thousands per turn. Recent Claude Code versions mitigate this with Tool Search, now enabled by default: tools are discovered and loaded on demand instead of shipping every schema upfront. It helps, but large MCP setups still add overhead and complexity. Run /context to inspect what each server is costing you, and prune unused servers with /mcp.
Compact early — around half-full, not when it breaks.
Quality degrades as the window fills; by the time responses feel hazy, you’re already deep in the rot zone. The practical sweet spot is compacting around 50–60% capacity, while Claude still has full, uncompressed context to summarize from. Waiting until 90% means you’re summarizing an already-degraded view.
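As a rule of thumb, the check is a single ratio. Here is a tiny sketch, assuming you read the used and total token counts off /context yourself. The 200,000-token window in the example and the 0.55 default threshold are illustrative assumptions, not values the tool prescribes.

```python
def should_compact(used_tokens: int, window_tokens: int, threshold: float = 0.55) -> bool:
    """True once context usage reaches the chosen fraction of the window."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens >= threshold


if __name__ == "__main__":
    window = 200_000  # assumed window size, for illustration only
    for used in (80_000, 120_000, 180_000):
        verdict = "compact now" if should_compact(used, window) else "keep going"
        print(f"{used / window:.0%} full: {verdict}")
```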
Git worktrees for parallel agents.
Independent features don’t need to run one after another. Worktrees give each agent its own branch and isolated directory off the same repo, so three agents can finish in about an hour what would take three hours sequentially, with no context collisions and far less merge chaos.
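Setting this up is one `git worktree add` per feature. The sketch below only builds the commands so you can inspect them before running anything. The directory layout (sibling folders named after the repo and feature) and the `feature/` branch prefix are assumptions of this example, not a convention git or Claude Code requires.

```python
import shlex
from pathlib import PurePosixPath


def worktree_command(repo_name: str, feature: str, parent: str = "..") -> list[str]:
    """Build `git worktree add -b <branch> <path>` for one feature."""
    path = PurePosixPath(parent) / f"{repo_name}-{feature}"
    return ["git", "worktree", "add", "-b", f"feature/{feature}", str(path)]


if __name__ == "__main__":
    # Hypothetical feature names; each agent then works in its own directory.
    for name in ("billing-limits", "audit-log", "sso"):
        print(shlex.join(worktree_command("app", name)))
```

Run each printed command from the main checkout, then start one agent session per directory.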
Thinking is the new bottleneck
The shape of building software has flipped.
Writing code used to be the bottleneck — now it’s the part agents are best at. What’s left for us is deciding what’s actually worth building and being precise enough that the implementation doesn’t drift.
That’s why the highest-leverage skill in this workflow is /grill-me, a skill that generates no code at all.
The leverage has moved upstream. Specificity, edge-case honesty, and tight context management matter more than raw output.
Everything else — lean context, smaller files, the right model for the right phase — is just there to protect that thinking from noise.
The grill comes first. The code comes second.