Tilling the Garden: Use AI differently to make interesting and useful apps
Mike Caulfield introduces Plot.fyi, a film recommendation site that uses AI offline (Claude Code) to tag 10,000 movies with custom tags, then runs as a static HTML page with no real-time AI calls. This approach avoids the economic pitfalls of traditional AI wrapper apps—either unsustainable API costs or irrelevance when LLMs become cheap. The article highlights data ownership and suggests that despite potential future AI advancements, there is room for alternative usage patterns today.
Mike Caulfield
Jun 04, 2026
I’ve been talking recently about Plot.fyi, my new film recommendation site. That’s partially because it’s pretty neat, and has resulted in me finding lots of great films. But the most interesting thing about the site is how it uses (and doesn’t use) LLMs.
Everyone knows the way an AI-driven site is supposed to work. You have a user who asks a question, like “What are 10 films like The Hateful Eight?” That question gets taken from the user and embedded behind the scenes into a larger prompt that provides some instructions about values, methods, and sources, and is fed to one of the Big Three LLMs. You get an answer back. If the prompt is good and some well-chosen post-processing is used to check and ground it, you hopefully get something useful. As the models get better, you need less of a wrapper and less post-processing, supposedly at least.
With this approach, the core value of the exchange is in the model itself, assisted by the wrapper. But as many people writing these wrapping applications have discovered, if you live and you die by the model you hit a bit of a Catch-22:
If what the model does is expensive, you pay for each and every request at an unsustainable rate for scaling. You’re the front end for a model, and you absorb the risk and the makers of the model make money whether you succeed or not. As they say in the gambling industry, the house always wins. And in this case the LLMs are definitely the house.
If what you’re doing becomes cheap and easy, you’ve got a different problem, because now no one needs you at all. The hope that the cost will come down for your backend functions is a bit of a monkey’s paw wish. If it doesn’t come down, you bleed money. If it comes down too much, people will just do it in the LLM, which might be an even bigger problem.
The brutal economy of wrapper applications is part of the reason I’ve generally circulated reusable prompts rather than build wrapper sites. At least with prompts I’m not absorbing the risks of investing in something where I’ll pay a tax per user until it’s made irrelevant.
Back to the digital garden
Is there a way to avoid this conundrum? For some set of things, yes. You can extract the value out of the LLM and put it into content you control and cultivate directly.
This is a bit of an ask, but I’m going to ask you to go to Plot.fyi for a moment. Click around it for about half a minute. Thirty seconds. Understanding what the site is, even just having a thirty-second understanding, is crucial to understanding my next point.
In case you’re wondering, yes, I do think that the blood-soaked Tarantino film The Hateful Eight is in fact parallel to Murder On the Orient Express in many ways (though a perfect response I think would surface And Then There Were None). It’s actually these sorts of connections that excite me.
Ready?
If you clicked around, I imagine you think it looks like any number of AI/ML sites that select films by taking an input (a film you like) and finding relevant matches by feeding that input, inside a prompt wrapper, to an AI.
In fact, while I built it using AI, the site itself uses no AI at all.
It doesn’t even use a server. It doesn’t have a database. There is no backend.
The entire application is HTML and JSON. There is an initial hit to bandwidth in the form of a 1.9MB data file.1 After that the program runs entirely in your browser, consuming the practical equivalent of zero CPU cycles.
If you want to visualize the size of that data file, here’s a 1990 ad for a 1.4MB floppy drive:
How is that possible? How can a program act like an AI recommendation engine for over 10,000 films and use no AI?
The key is using modern AI, but using it the way we used AI in the past. I’m not saying the following approach works on everything, but for some subset of things you can apply it. I detail it here.
An internet of weird little things (without the AI tax)
The way the site works is this. I spent the past two months using Claude Code to tag a dataset of 10,000 films with a variety of tags I invented that flag certain plot elements.
This shouldn’t seem radical. We used to enrich datasets with tags a lot before we got these powerful systems that could synthesize information without needing our own rich, tagged data. (Researchers still do, all the time!)
What’s new is the scale of things you can accomplish with a model the size of Claude Opus. Ask it for the themes of a relatively little-known film from the 1940s, and it will spit out fairly accurate bullet points for you without needing to fall back on search. Ask it if any of twenty listed films have a “redemption arc” for the main protagonist, and it will, with impressive accuracy, point to the ones that do. Run it through all the romance films, 40 at a time, and see if any of your romance tags fit.
So that’s what I did. Over almost two months I used Claude Code to run hundreds and hundreds of tags over thousands of films and wrote all those tags to the data files.2
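The batching loop described above can be sketched roughly as follows. This is a minimal illustration, not the actual Plot.fyi tooling: the `askModel` callback, the film object shape, and the function names are all assumptions, and the model call is stubbed out (and kept synchronous) so the sketch runs on its own.

```javascript
// Hypothetical sketch: apply one tag to many films, one batch at a time.
// `askModel` stands in for whatever LLM call you use. It is an assumption,
// not a real API: it receives a tag and a batch of films and returns the
// IDs of the films the tag applies to. A real call would likely be async.

// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Ask the model about each batch and write matching tags back to the films.
function applyTag(films, tag, askModel, batchSize = 40) {
  for (const batch of chunk(films, batchSize)) {
    const matchingIds = new Set(askModel(tag, batch));
    for (const film of batch) {
      if (matchingIds.has(film.id) && !film.tags.includes(tag)) {
        film.tags.push(tag);
      }
    }
  }
  return films;
}

// Demo with a stubbed "model" that flags films whose notes mention redemption.
const demoFilms = [
  { id: 1, notes: 'a thief finds redemption', tags: [] },
  { id: 2, notes: 'a heist goes wrong', tags: [] },
];
const stubModel = (tag, batch) =>
  batch.filter((film) => film.notes.includes('redemption')).map((film) => film.id);

applyTag(demoFilms, 'redemption-arc', stubModel);
console.log(demoFilms.map((film) => film.tags));
```

The point of the design is that the expensive step happens once, offline, and its result is written into data you keep.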
After doing that across about 700 tags for 10,000 films you end up with 10,000 files that look like this. Here’s The Hateful Eight.
Here’s The Brother from Another Planet:
These files then get spooled out as a dictionary and a series of ID pointers, creating that floppy disk size data runtime:
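One way to picture a "dictionary plus ID pointers" layout is the sketch below. The field names, the placeholder titles, and the example tags are all invented for illustration, and the real Plot.fyi file format may differ. The idea is simply that each tag name is stored once, and each film refers to it by a small integer.

```javascript
// Hypothetical sketch: compact per-film tag lists into a shared tag
// dictionary plus arrays of integer IDs. All names here are assumptions.

function packFilms(films) {
  const tagDictionary = []; // index -> tag name
  const tagIds = new Map(); // tag name -> index
  const packed = films.map((film) => ({
    title: film.title,
    tags: film.tags.map((tag) => {
      if (!tagIds.has(tag)) {
        tagIds.set(tag, tagDictionary.length);
        tagDictionary.push(tag);
      }
      return tagIds.get(tag);
    }),
  }));
  return { tagDictionary, films: packed };
}

// Turn a packed film's integer IDs back into readable tag names.
function unpackTags(packedData, filmIndex) {
  return packedData.films[filmIndex].tags.map((id) => packedData.tagDictionary[id]);
}

const packedData = packFilms([
  { title: 'Film A', tags: ['isolated-setting', 'revenge'] },
  { title: 'Film B', tags: ['revenge', 'amnesia'] },
]);

console.log(JSON.stringify(packedData));
// {"tagDictionary":["isolated-setting","revenge","amnesia"],"films":[{"title":"Film A","tags":[0,1]},{"title":"Film B","tags":[1,2]}]}
```

With hundreds of tags repeated across thousands of films, storing short integers instead of repeated strings is a large part of what keeps a file like this small.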
When asked about a film like The Hateful Eight, the system tallies up tag intersections across 10,000 records, computed as shown below:
For film buffs I’ll note that the Kubrickesque part of Hateful Eight is mostly the one-point perspective. One thing I learned in this process is you want to set your net pretty broad on these to get better connections (otherwise you lose the connective tissue).
It then uses that computed score to float similar films to the top of the list. It takes less than a tenth of a second of computation, because once you encode all this stuff in the data, you can do the film matching with a little bit of JavaScript math.
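The matching step can be sketched as a plain loop over every record. This is a simplified illustration that scores by a raw count of shared tags. The site's actual scoring may weight tags differently, and the function name and data shape (films carrying arrays of integer tag IDs) are assumptions.

```javascript
// Hypothetical sketch: rank films by how many tags they share with a
// query film. Each film carries an array of integer tag IDs.

function rankSimilar(films, queryIndex, limit = 10) {
  const queryTags = new Set(films[queryIndex].tags);
  const scored = [];
  films.forEach((film, index) => {
    if (index === queryIndex) return; // skip the query film itself
    let shared = 0;
    for (const tag of film.tags) {
      if (queryTags.has(tag)) shared += 1;
    }
    if (shared > 0) scored.push({ title: film.title, shared });
  });
  scored.sort((a, b) => b.shared - a.shared); // most shared tags first
  return scored.slice(0, limit);
}

const catalog = [
  { title: 'Query film', tags: [0, 1, 2, 3] },
  { title: 'Close match', tags: [1, 2, 3, 7] },
  { title: 'Loose match', tags: [3, 8] },
  { title: 'No match', tags: [9] },
];

console.log(rankSimilar(catalog, 0));
// [ { title: 'Close match', shared: 3 }, { title: 'Loose match', shared: 1 } ]
```

Each comparison is a handful of set lookups, so checking one film against 10,000 records is a trivial amount of work for a browser.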
Example
Let’s take Overboard, the Goldie Hawn/Kurt Russell film from the 1980s that is very creepy when you think about the premise for more than two seconds. As others have noted this isn’t a rom-com, it’s a horror plot!
Overboard: The 1987 film follows a narcissistic heiress who loses her memory after falling from her yacht. A mistreated working-class carpenter takes revenge by convincing her she is his wife and the mother of his unruly children. She navigates chaotic family life, ultimately coming to value many of its attributes.
So maybe there’s something there in that plot that you like, but you would like it without the whole abduction angle. Where else can you get it?
If you ask an LLM, the response is not bad. Here’s what it gives you back:
Films mentioned:
The Proposal
While You Were Sleeping
Maid in Manhattan
Sweet Home Alabama
Wedding Planner
Doc Hollywood
Coming to America
Kate and Leopold
The Prince and Me
Great. It’s about what a good Blockbuster employee would be able to tell you back in the day. Not just the average employee, the one you’d always seek out! Here’s the description of The Proposal. As the LLM has identified it’s got a similar vibe, but replaces false imprisonment with corporate bullying (an improvement!):
The Proposal: The 2009 romantic comedy The Proposal follows an overbearing New York book editor (Sandra Bullock) who, facing deportation to Canada, forces her long-suffering assistant (Ryan Reynolds) to marry her. The charade unravels when the fake couple travels to Alaska to meet his family, forcing them to confront their true feelings.
Now let’s see what we get with our broad simple tags and a search on Overboard:
We get a lot of the same stuff here. In this case, however, it’s generated by the tiniest bit of data. As noted above, this isn’t any billion-parameter production. It’s about 11 tag intersections.
This feels very weird to me. How can such a small dataset get you there? But it’s not really the size of the data on the individual items that matters; it’s that when numerous tags that represent as little as 5% of the data set intersect across a database of 10,000 records, interesting stuff floats to the top. That sort of scale of data enrichment used to be out of reach for average people, but now anyone can maintain and enrich a collection that size.
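A rough back-of-the-envelope calculation shows why intersections are so selective. It assumes, unrealistically, that tags are assigned independently of each other and that each one covers 5% of the films. Real tags are correlated, so treat the numbers as an illustration only. The function name is made up for this sketch.

```javascript
// Hypothetical illustration: how many films would share a given set of
// tags purely by chance, if every tag covered the same fraction of the
// catalog and tags were independent (a simplifying assumption).

function expectedChanceMatches(totalFilms, tagFrequency, sharedTags) {
  return totalFilms * Math.pow(tagFrequency, sharedTags);
}

for (let k = 1; k <= 4; k++) {
  console.log(k, expectedChanceMatches(10000, 0.05, k).toFixed(4));
}
// 1 500.0000
// 2 25.0000
// 3 1.2500
// 4 0.0625
```

Under those assumptions, one shared tag narrows 10,000 films to about 500, but three shared tags narrow them to about one. A film that shares many tags with your query is very unlikely to be there by accident.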
That means a couple things.
First, as mentioned, we can run the whole site Plot.fyi as a serverless HTML page reading JSON data. No AI, no server. Serving results for over 10,000 films. We use the LLM to build the product but escape the LLM tax to run it.
Second, I was able to get what I feel are better responses than I do with an LLM, at least in terms of what I think a good answer is.
Plot.fyi, for example, identifies I Married a Witch (an old Veronica Lake film) as a parallel. While difficult to explain, this is a) correct, and b) a connection that does not seem to have been made by anyone online before. In I Married a Witch, a witch forms a relationship with a man to exact revenge (like Russell) for mistreatment the target is unaware of (like Hawn) and the witch grows during the course of the revenge to fall in love and value “ordinary” life (like Hawn, in a bit of a switch-up, with the revenger having the realization).3
This is a different sort of match than The Proposal. If you search The Proposal and Overboard you’ll find a dozen pages talking about these films having the same vibe:
That’s not surprising; this sort of web text is a lot of what the LLM was trained on after all. In contrast, you will not find a page on the internet that compares I Married a Witch and Overboard. There are multiple reasons the JavaScript finds this whereas the LLM does not, but a big one is standard code with well tagged data is better at mapping a complete space than prompting. We can loop over every record in the database and check it against the input film in a way that would take hours and cost hundreds of dollars to do with a prompt. The cost to us with the V8 JavaScript engine in Chrome? Essentially nothing, and less than five milliseconds of processing.
The final thing is this. You own that data.
I mean, I’ve mucked it a bit here because I’m me and have implemented it in such a way that anyone can grab my JavaScript and grab almost two months of my tagging work for their own purposes. I’ve developed openly available prototypes for 30 years; it’s hard to stop.
But that sort of approach is not required. You could build an approach like this into a database that no one can access directly if an open approach is not your thing. It’s up to you: feed the value back out to the community, or keep it as your competitive advantage. Either way the value is not in the model, it’s in the data you enriched with the model.
Maybe this is temporary, but so what
It’s possible that in six months this post will sound cute. A lot of smarter people than me think what the LLM will be capable of soon will so far outstrip other approaches that building things that don’t primarily rely on request-time AI compute will be silly. Under that view, little experiments like this where we turn AI around and use it to enrich the Commons or try other weird things will be historical footnotes at best.
Maybe.
But to quote another film, “OK.”
Maybe we’re on this ride down this slope, and our uselessness in the face of the ever-growing capabilities is pre-determined. Maybe these little schemes are futile.
So what.
We’re here right now, we should live how we want to live. The future will arrive when it arrives.
I’m not saying what I’ve outlined here is the one right way to use AI. I’m not giving up on writing mega-prompts, or even writing the occasional Node-based wrapper. I haven’t found a new religion. If you’re reading this thinking my point is “all AI use must follow this external content enrichment pattern” you’ve read me wrong.
I’m just saying right now there’s a lot of room for exp
[truncated for AI cost control]