The Mental Models I Use to Work with AI
Mete Polat shares eight mental models for working with AI, ranging from practical prompting tips to meta-level insights about the AI industry. Key ideas include upfront alignment, rewinding over steering, giving AI the same tools, using bad outputs as taste signals, preferring visual input, building a reference library, design as antidote to slop, and adversarial review between LLMs.
Mete Polat
Jul 02, 2026
Hi. I’m back from my summer break that I spent traveling and tinkering. Now, back to our regularly scheduled programming.
This week, I wanted to do something a bit different. Instead of deep diving into one specific idea, I’m going wide and putting together a list of my favorite mental models that guide the way I work with and understand AI. The world of AI is opaque - both the models themselves and the rapidly evolving industry around them. Mental models are a way to map the territory so you can find your way through.
Most of these models took shape in my own head through trial and error. Others I stole from my favorite people. Some are new while others are references to ideas I’ve explored in detail previously. All of them earned a place in my carefully curated toolbox that I use to get the most out of AI and to understand the discourse around it.
Organizational note - I start with more practical mental models that will help you get better outputs when working with AI agents. As you go down the list, they get a bit more meta and less “applied” - the mental models that help me make sense of how AI is changing the tech industry and the discourse around it.
- Go for extreme upfront alignment
The more assumptions you make explicit up front, the less time you spend cleaning up weird agent behavior later.
There’s a famous computer science thought experiment that asks you to describe how to make a peanut butter & jelly sandwich in detail to a computer. The point is that it’s very hard, and it makes you realize how much of the process lives implicitly in your head. As you can guess, it’s meant to demonstrate how explicit you need to be with your instructions to get the expected output. Any assumption that remains implicit will lead to unexpected outputs.
This is even more relevant to working with LLMs because instead of erroring out, they will replace your implicit assumptions with their own, often without telling you. Your job is to catch as many of these assumptions as possible as early as possible. This is where the “upfront” part comes in - you should invest the most energy into your initial prompt & context that defines what it is you want. There’s a lot more ROI on your effort here versus trying to steer the AI later on when it’s going down the wrong track (see next mental model). In the end, you’ll likely actually save effort.
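To make the "erroring out vs. silently guessing" contrast concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: `SandwichSpec`, `strict_build`, `lenient_build`, and the `MODEL_DEFAULTS` values are hypothetical stand-ins, not how any model actually works internally.

```python
from dataclasses import dataclass


@dataclass
class SandwichSpec:
    bread: str
    spread: str
    cut: str


def strict_build(**fields: str) -> SandwichSpec:
    """Behaves like a compiler: a missing field is a loud error."""
    return SandwichSpec(**fields)  # raises TypeError if a field is missing


# Hypothetical defaults standing in for whatever the model assumes on its own.
MODEL_DEFAULTS = {"bread": "white", "spread": "peanut butter", "cut": "diagonal"}


def lenient_build(**fields: str) -> tuple[SandwichSpec, list[str]]:
    """Behaves like an LLM: missing fields are filled in silently.

    Returns the spec plus the names of the fields that were guessed, which is
    the list you would want surfaced before any work starts.
    """
    guessed = sorted(set(MODEL_DEFAULTS) - set(fields))
    return SandwichSpec(**{**MODEL_DEFAULTS, **fields}), guessed
```

Calling `lenient_build(bread="rye")` succeeds and reports `cut` and `spread` as guessed, while `strict_build(bread="rye")` fails immediately. Upfront alignment is roughly the work of shrinking that guessed list before the agent starts.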
After the initial prompt, the best way to achieve alignment is through in-depth Q&A. I have this text snippet (which I found on X a long time ago) mapped to the “.ama” shortcut on my computer:
Interview me in detail using the AskUserQuestionTool about literally anything: technical implementation, UI & UX, concerns, tradeoffs, etc. but make sure the questions are not obvious.
I often drop this at the end of a very long initial prompt and then spend hours answering questions. Before the model starts building anything, I will have an extremely detailed spec with 99% of assumptions made explicit. When you start building, you should continue to extract latent assumptions through workflows like these:
Thariq (@trq212):
been asking others at Anthropic how they stay in the loop with Claude and fully understand the work being done
this is one of my favorites from Suzanne:
8:29 PM · Jun 1, 2026
- Rewinding > Steering
When an AI run starts wrong, restarting from better context usually beats patching over a bad trajectory.
When the initial output you get from AI doesn’t meet your expectations, your immediate reaction is to steer towards what you actually want by providing negative feedback (“here’s what’s wrong”). While chat is the right metaphor for AI overall, it does trick us into working with it like we would with a human - if there’s a misunderstanding or an error, you overcome it or fix it. That’s just natural. You don’t just hit up your coworker Jeff in another empty chat window to retry the conversation you just had.
In reality, the process is often more akin to drawing - if you get too many initial lines wrong and lean heavily on the eraser, the canvas will bear the marks of your previous errors forever and will influence your resulting composition. The best solution in this case is to crumple the paper, toss it in the bin (go Knicks!), and start over.
LLMs are extremely susceptible to path dependence - the “phenomenon of past events or decisions constraining or defining later events or decisions”. Their initial errors and your desperate turn-by-turn attempts to correct them compound into muddied context and duct-taped solutions.
So - if AI’s initial output is significantly far off from your expectations, you’re better off taking a step back and rewriting your starting prompt and the context around it. If the output meets your expectations but deteriorates as you continue iterating, you need to rewind (in Codex and some other apps it’s also called Forking) to the point before the deterioration happens and try again.
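The difference between steering and rewinding is easier to see if you treat the chat as a plain list of turns. The sketch below is a toy model under that assumption; the `Conversation` class and its `fork` method are hypothetical names for illustration, not the API of Codex or any other app.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """A chat history as an ordered list of (role, text) turns."""
    turns: list[tuple[str, str]] = field(default_factory=list)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def fork(self, keep: int) -> "Conversation":
        """Return a new conversation containing only the first `keep` turns."""
        return Conversation(turns=list(self.turns[:keep]))


# Steering: every correction is appended, so the bad draft stays in context.
steered = Conversation()
steered.add("user", "Build a settings page.")
steered.add("assistant", "<draft that missed the point>")
steered.add("user", "No, not like that. Use the existing design system.")

# Rewinding: fork from before the bad draft and restate the request with the
# missing assumption made explicit.
rewound = steered.fork(keep=0)
rewound.add("user", "Build a settings page using the existing design system.")
```

The steered history carries three turns, including the failed draft, into every later response. The rewound one carries a single, better prompt and nothing else.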
S/o to Juan Ramirez for clarifying this idea for me.
- AI Has The Same Tools As You
Agents can handle more of the setup, testing, and browser work if you give them the same tools you use.
One of the most annoying aspects of any project is all the setup around it - getting API keys, provisioning certificates, configuring permissions, etc. At this point, agents are capable of doing 99% of those things themselves through a CLI (ideal), MCP (good), or browser control and computer use (ok). So, for example, if you’re hosting on Cloudflare, make sure you install Wrangler CLI and sign in (or ask the agent to do it); if you’re using Supabase as a backend, install Supabase MCP & CLI; if you need to test a web / desktop / iOS app, Codex & Claude Code can do 99% of testing for you.
Oftentimes, you can find the most popular integrations in Codex’s plugin store and easily enable them from there. Lastly, if there’s no CLI or MCP available (ask your agent), I will now often sign in to whatever dev portal I need right inside the Codex browser and tell the agent to set everything up for me - it knows where to go, what to click, and what to type.
The point is - AI can do 99% of the things you can do on a computer, so let it - give it the right tools and ask it to use them.
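A quick way to check whether the agent really has the same tools as you is to look for the CLIs on your PATH before kicking off a session. This is a minimal sketch: the `missing_tools` helper is a hypothetical name, the tool list is only an example based on the stack mentioned above, and it checks that a command exists, not that you are signed in.

```python
import shutil


def missing_tools(required: list[str]) -> list[str]:
    """Return the commands from `required` that are not found on PATH."""
    return [name for name in required if shutil.which(name) is None]


if __name__ == "__main__":
    # Example list only; swap in whatever your own stack needs.
    missing = missing_tools(["wrangler", "supabase", "git"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed CLIs are available to the agent.")
```

Anything the check reports as missing is something the agent will either have to install itself or work around through slower routes like browser control.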
- Bad AI Output Is A Feature, Not a Bug (Most Times)
Early bad outputs are not just failures; they are taste signals that help you find the shape of the thing.
The current internet discourse around AI is obsessed with “one-shotting” - the elusive experience of AI nailing your idea in one turn. At best, this is just engagement bait - cool-looking dashboards get likes. At worst, this is fantasy. In practice, like any first draft in any creative pursuit, your initial output will be bad.
But hold on, didn’t I just profess above that if your initial output is bad, you’d better start over with a different prompt? Yes, sort of. This depends on how early AI comes into your creative process. If your vision is fully crystallized and you have exact design mockups and PRDs, then AI failing to produce your vision up to spec is a prompting and context engineering problem.
On the other hand, if you use AI earlier in your creative journey to shape your vision, bad output is creative fodder. Your negative reaction is a signal, because it helps you understand what you do not want. In the early stages, this is often even more valuable than knowing what you want. “This doesn’t feel quite right” becomes information about your own taste and the creative direction you’re looking for.
Here, again, beware of path dependence - purely reacting to bad output can set you on an inherently wrong path. Use it to make novel connections, find new paths, and map the terrain.
- If You Want Specific Visual Output, Use Visual Input
When the target is visual, references beat long prose prompts almost every time.
Providing visual references is a more effective way to guide visual output, whether it’s front-end work or image & diffusion models. Giving an image model one reference of what you’re looking for is way more effective than writing a 1,000-word ultra-detailed prompt describing every detail (which is quickly becoming obsolete as models get better). If you often work with AI-generated images and videos and need to iterate to get to specific results, I highly recommend you try Flora.ai - it’s effectively this principle productized inside a node-based infinite canvas. And the best way to have visual references at the ready is to become a digital hoarder:
- Become a Digital Hoarder
Your reference library is becoming a machine-readable map of your taste.
In my previous issue, I argued that the best way to become a better AI-pilled designer is to become a digital hoarder and start collecting references across all modalities, not just text.
More importantly, as AI becomes more “omni”, this growing pile of references gets increasingly valuable as it (a) encapsulates your taste and (b) serves as raw material for AI as you begin morphing your products. Imagine your taste as a very blurry image - as you collect more artifacts that align with your taste, you slowly increase the fidelity of this “taste image” for AI. And as you jump into new projects, instead of asking Claude to “make it more modern”, you will have a rich library of references to draw from. You can now steer the output away from slop and towards taste.
- Design Is an Antidote to Slop Adaptation
Design is how you keep models from collapsing into the same average-looking sludge.
This may be my 3rd or 4th time linking to this tweet, but here we go:
hilary gridley (@yourgirlhils):
today’s amazing new AI-designed artifacts will look like slop in a month, once everyone learns to recognize the patterns the model falls back on. like AI-generated writing, the output isn’t objectively “bad,” (in fact it is often technically quite good), but once it becomes
Quoting Claude (@claudeai):
Introducing Claude Design by Anthropic Labs: make prototypes, slides, and one-pagers by talking to Claude.
Powered by Claude Opus 4.7, our most capable vision model. Available in research preview on the Pro, Max, Team, and Enterprise plans, rolling out throughout the day.
1:50 PM · Apr 18, 2026
Without opinionated guidance, every model, however amazing, will regress towards the same patterns that we will recognize as slop. As designers, we’ll always have a job in steering models away from it.
- Turn LLMs On Each Other
Use one model to pressure-test another so your review bandwidth is not the bottleneck.
The more you work with AI, the more you find yourself to be the bottleneck of the process - your work moves as fast as you can define new tasks and review the output. While we’re still nowhere close to stepping out of the loop completely, you should start using AI to critically review the output for you before you have to review it yourself. This is called adversarial review - when you get one LLM to critically review the work of another model, specifically focusing on finding flaws, vulnerabilities, edge-case failures, etc. As someone wrote on Reddit, “adversarial reviews are stupidly easy and unfairly useful”.
In theory, you could do this with one LLM and get it to review itself, and Claude Code already has presets like /simplify and /review. But using a different model for an adversarial review helps you avoid one model’s blind spots and apply two different “intelligence shapes” to the same problem, giving you broader coverage overall.
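Stripped of any specific tool, an adversarial review between two models is a simple loop: one side reviews, the other revises, and you stop when the reviewer has nothing critical left. Here is a minimal sketch of that loop. The `review` and `revise` callables are hypothetical stand-ins for calls to two different models, and the `max_rounds` cap is my own assumption to keep the loop from running forever.

```python
from typing import Callable

# Both callables are hypothetical stand-ins for calls to two different models.
Reviewer = Callable[[str], list[str]]      # returns a list of critical issues
Reviser = Callable[[str, list[str]], str]  # returns a revised draft


def adversarial_review(draft: str, review: Reviewer, revise: Reviser,
                       max_rounds: int = 5) -> tuple[str, list[str]]:
    """Alternate review and revision until the reviewer finds nothing.

    Returns the final draft and any issues still open when the round limit
    was reached (an empty list means the reviewer signed off).
    """
    issues = review(draft)
    for _ in range(max_rounds):
        if not issues:
            break
        draft = revise(draft, issues)
        issues = review(draft)
    return draft, issues
```

Returning the leftover issues matters: if the two models keep disagreeing after the last round, that list is exactly what deserves your own review time.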
When I use Claude Code, I like to use the /codex plugin, and performing an adversarial review is as easy as tacking this onto your prompt: “Perform an adversarial review on your implementation plan with /codex, resolve the critical issues, and keep reviewing until no issues come up”. For a more manual setup, ask your main model to write a prompt
[truncated for AI cost control]