Disposable Software – How to Stop Worrying and Love the AI Code
The article explores the concept of 'disposable software' in the AI era, arguing that AI-generated code should be treated as disposable to accelerate development, much like mass-produced furniture replaced artisan craftsmanship. A case study demonstrates successful AI refactoring, and a 'Disposable Code Manifesto' is proposed with three pillars: intent, requirements, and safety.
April 3, 2026 · 20 min read
It’s pretty clear we’re in the “disposable software era”. Plenty of blog posts chatting about it:
The Disposable Software Era
Disposable Software: How AI is Redefining What Software Means
etc. Just google “Disposable software”
For the most part, though, the examples referenced are about small, bespoke solutions to particular problems, often short-lived and/or non-production code, or things like “LLM needs to do a thing so it spits out some python, executes it, and deletes it.”
We should be thinking of how “Disposable Code” in production helps us move faster and take advantage of the gazillions of dollars spent on AI. We long ago accepted that infrastructure should be disposable; why shouldn’t we treat code like cattle, not pets?
Get over it
Some part of this argument is going to come down to “it just is, build a bridge and get over it.” This sucks. I wish that I could write terrible Rails code and get paid $250,000 a year, but those days are quickly coming to an end. Maybe not today, maybe not tomorrow, but sometime soon. It doesn’t mean the end of software development (in the near term). It just means that the nature of the work is changing.
(The spectre of AI taking all our jobs has always been a thing; we just thought it was a lot further away than it appears today. The problem has always been there, we just thought we had more time. Oh well.)
The way we wrote code prior to AI was akin to how people made furniture back in the olden days: with a high degree of quality and craftsmanship. If you wanted a chair, you’d have to learn to be a carpenter (or hire a master carpenter). Seating for 100 people? And you want chairs? Well, sorry, no you don’t, because that would take a decade and infinite money and several people would die in the process. You get a long bench and fill your building with those.
Then the industrial revolution came about and all of a sudden we could produce chairs by the boatload. The price dropped dramatically, and yes, the quality wasn’t as good.
Check out this really fancy old-timey desk:
It costs $38,000 and arrives in 24 to 36 weeks.
Check out this desk:
It’s $209 and I can get it delivered today.
They do the same job, right? There’s a tabletop and some drawers to hold things. Papers. Business papers. But the $38,000 desk is for someone that wants an artisanal desk that someone broke their back to make and is willing to pay for it. The rest of us load up our blue shopping cart and eat the meatballs.
And now, humans have gone from artisans making hand-crafted code to operators of a non-deterministic AI machine that makes chairs, er, code. So how do we take a non-deterministic token machine and produce high-quality, working code?
Plan to throw it away.
Throw it away?
We used to say that old code rusted. “Software rots” is a thing we’ve been talking about for a while. And we’ve all had an experience where you open up your editor, look at some code, scratch your head and say “Who wrote this garbage?” only to have git blame show you as the author.
Human thinking doesn’t scale. Large software doesn’t scale. Reading and understanding every line of code doesn’t scale. It never did, we have just been pretending. Let’s stop pretending.
We need to build a system with this in mind. Why?
A Simple Example
Back in the dark ages (late 2024/early 2025) at Studio Charter we needed a way for customers to get web access to onsite 4k video recordings.
This is complexified because:
The files live on a USB SSD attached to an onsite ATEM
ATEM only supports anonymous unauthenticated FTP access
The onsite Mac that pulls the files off the ATEM needs appropriate credentials to store the files “somewhere else” (aka the Cloud)
Since the Mac is onsite with our customers - some of whom are coworking places - it could walk away at any minute, so each credential needs to be locked down so that it can only write to a specific per-customer per-video-studio storage bucket. It cannot read file data, delete anything, etc.
After the files are uploaded to this storage bucket, they then get moved to a final location after some post-processing occurs (verification/validation, de-duping/cleanup, etc.)
The user might be a guest who isn’t a member of the coworking space or an employee at the company that the studio is sitting in.
So we have to get files into a Ruby on Rails app that a person can click and download, or do light-duty editing on.
Since I’m using Rails for this we have some patterns. The normal approach, and how Rails’ ActiveStorage is architected, is that ActiveStorage handles the entire lifecycle of the file (initial upload, ActiveRecord object creation, etc.). However, it doesn’t make sense to have the Mac send its files to Rails - there could be dozens or hundreds of files, ranging in size from a few megabytes to tens of gigabytes. Throwing this at the Rails app is super expensive and failure-prone.
I needed a way to get the files off of the Mac, upload them to GCP, then somehow tell the Rails app that the files were there (or Rails could poll the directories) so that Rails could create the appropriate file objects. Back when I was doing this, I only had access to GPT4 and Claude 3.something, so I opened up chat windows and went back-and-forth with each, trying to get to a reasonable architecture.
We decided to do something like:
The Rails app will generate appropriate GCP buckets when lifecycle events occur (new customers, new studios, etc.)
:handwave: securely get the appropriate gcp keys on the Macs
use rclone to sync the files from ATEM via the Mac
GCP Pub/Sub would, once whatever processing was complete and the files were moved into the final bucket, send signed webhooks for the bucket’s file lifecycle events to the Rails app
The Rails app would create appropriate ActiveStorage file objects for each one
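The write-only, per-customer per-studio credential described earlier can be made concrete with a rough sketch. This is not the app's actual code: the bucket-naming scheme, the method names, and the service account email are all hypothetical. It assumes Cloud Storage's predefined roles/storage.objectCreator role, which allows creating objects but not reading, listing, or deleting them, and it only builds the policy data; actually applying it through the GCP API is left out.

```ruby
# Hypothetical sketch: the naming scheme and method names are assumptions,
# not taken from the original app.

# Predefined Cloud Storage role that can create objects but cannot
# read, list, or delete them.
WRITE_ONLY_ROLE = "roles/storage.objectCreator"

# One bucket per customer per studio, so a key on a stolen Mac can only
# write into that single bucket.
def upload_bucket_name(customer_slug:, studio_slug:, prefix: "uploads")
  name = [prefix, customer_slug, studio_slug].join("-").downcase
  # Conservative subset of the bucket naming rules: 3-63 characters,
  # lowercase letters, digits, dashes, underscores.
  unless name.match?(/\A[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]\z/)
    raise ArgumentError, "invalid bucket name: #{name}"
  end
  name
end

# IAM binding that grants only the write-only role to one service account.
def write_only_binding(service_account_email)
  {
    "role" => WRITE_ONLY_ROLE,
    "members" => ["serviceAccount:#{service_account_email}"]
  }
end

# Everything needed to provision one studio's upload target.
def upload_policy(customer_slug:, studio_slug:, service_account_email:)
  {
    "bucket" => upload_bucket_name(customer_slug: customer_slug, studio_slug: studio_slug),
    "bindings" => [write_only_binding(service_account_email)]
  }
end

policy = upload_policy(
  customer_slug: "Acme",
  studio_slug: "studio-1",
  service_account_email: "studio-mac@example-project.iam.gserviceaccount.com"
)
puts policy["bucket"]
```

Keeping the role and the bucket name in one small function makes the security property easy to check in a test: if the role ever changes to something broader, one assertion fails.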
This seemed reasonable. How to write all this code? I used ChatGPT web UI directly to have it generate the Rails code:
GCP API calls to manage bucket creation etc.
Inbound GCP Webhook processing
GCP API calls for web file lifecycles (deleting the file, etc.)
Can I tell you? This code is terrible. We have the following Rails models:
GCPSync
GCPSyncItem
GCPFile
GCPConfiguration
This looks approximately fine, but when you dive into the details it’s written in a way that doesn’t make any sense. When the gcp_webhook_controller receives the ping, it sends the message (via a background job) to NotificationService, which creates a GCPSync and then a GCPFile directly for each file in the message.
We have a bunch of other service objects GPT created:
GCPConfiguration
GCP::BucketOperator
GCP::FileOperator
GCP::IAMService
GCP::RefreshService
GCP::SecretManagerService
The NotificationService takes the GCP file payload and creates a GCPConfiguration item, and passes it to the Bucket and File operators, then finally creates the GCPFile object and associates the GCPConfiguration with it.
According to wc -l there are:
1335 lines of Ruby in the GCP service objects (+1676 lines of tests)
952 lines of Ruby in the various GCP models (+983 lines of tests)
The way it creates objects is weird; the order of things is weird; and there are lots of “keep poking at it to make it work” kinda hacks (i.e., adding dependency injection purely to make testing easier; a simpler implementation would’ve made the testing easier and not required DI) - things that might cause a human to go “huh, I’m working too hard to do this, maybe I should re-think things”.
I had Opus 4.6 re-architect this and it dramatically cut back on objects; the GCPConfiguration became (roughly) a few attributes on the GCPFile, etc. The combined code it spit out was 264 lines and looks a lot more sane/manageable. We were able to re-use the bulk of the relevant test suite, so the build-evaluate-fix-rerun loop was pretty straightforward. I haven’t benchmarked it since this is not speed-critical code, but I’m sure it’s faster to execute, and there are fewer tests, so the test suite certainly runs faster.
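To give a feel for what "a few attributes on the GCPFile" can look like, here is a minimal plain-Ruby sketch, not the refactored code itself. The GcpFile struct and the gcp_file_from_push method are hypothetical names. The payload shape is an assumption too: it follows the standard Pub/Sub push envelope for Cloud Storage notifications (base64-encoded object metadata plus an eventType attribute), which may differ from what the real app receives. Signature verification is out of scope here.

```ruby
require "json"

# Hypothetical flat record: what used to be a separate configuration object
# is just a handful of plain attributes on the file itself.
GcpFile = Struct.new(
  :bucket, :object_name, :size_bytes, :content_type, :generation,
  keyword_init: true
)

# Decode a Pub/Sub push envelope carrying a Cloud Storage notification.
# Returns a GcpFile for newly finalized objects and nil for other events.
def gcp_file_from_push(raw_body)
  message = JSON.parse(raw_body).fetch("message")
  event = message.fetch("attributes", {})["eventType"]
  return nil unless event == "OBJECT_FINALIZE"

  # The "data" field is base64-encoded JSON; unpack1("m") decodes it
  # without needing any extra library.
  object = JSON.parse(message.fetch("data").unpack1("m"))

  GcpFile.new(
    bucket: object.fetch("bucket"),
    object_name: object.fetch("name"),
    size_bytes: Integer(object.fetch("size")), # size arrives as a string
    content_type: object["contentType"],
    generation: object.fetch("generation")
  )
end

# Example with a made-up payload.
object = {
  "bucket" => "final-acme-studio-1",
  "name" => "take-01.mp4",
  "size" => "1048576",
  "contentType" => "video/mp4",
  "generation" => "1712150000000000"
}
body = JSON.generate(
  "message" => {
    "attributes" => { "eventType" => "OBJECT_FINALIZE" },
    "data" => [JSON.generate(object)].pack("m0")
  }
)
file = gcp_file_from_push(body)
puts "#{file.bucket}/#{file.object_name} (#{file.size_bytes} bytes)"
```

In a Rails app the struct would presumably be an ActiveRecord model and the method would live in a background job, but the point stands: one parse step and one record, with no operator or configuration objects in between.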
What are the aspects of this example that made this AI refactor successful? How was I able to easily throw away the old code and replace it? How can I trust that the code isn’t terrible, leaking secrets, or a total disaster? What does it take to write disposable code?
Disposable Code Manifesto
Generally speaking, to be trusted, AI-produced disposable code needs to:
Do what you intend
Meet particular requirements
Be safe
Do What You Intend
Waterfall was, for a dismal while, The Way to produce code. People would go into a deep cave and produce giant reams of paper describing every fine detail down to the semicolon. Humans were, essentially, very fancy typists that turned UML into zillions of Java classes.
We (rightfully) discarded this in favor of Agile development. Waterfall is bad because unshipped product is a liability - it’s muda. We need fast iterations with our customers to really understand their problems, propose solutions, and iterate, until we get it “just right”. Waterfall keeps us locked in our cubes until it’s too late.
Agile teaches us that long lead times are bad. That we need working software over documentation.
So how do we write code with LLMs? “Spec Driven Development” is, generally speaking, how we do it today.
Isn’t this just waterfall for AI? Kinda feels an awful lot like the Waterfall of the past (or something like cucumber - which I could never enjoy writing).
SDD isn’t really waterfall in that sense. LLMs are fuzzy compilers of intent to code. We don’t have to specify things to the n-th degree. We don’t need to specify that passwords should be hashed.
Why not?
LLMs already possess the world’s knowledge of Java-garbage-boilerplate, standard libraries, design patterns, and tools. They have slurped up OWASP and - thanks to the dedicated folks at AI labs - been beaten into submission that “thou shalt not put plaintext passwords in the database.”
I can just tell it “This is a Rails 8 app, I need a login page, use a gem wherever possible” and it will happily add Devise and/or OmniAuth, or reach for Rails’ built-in has_secure_password, and the passwords end up hashed either way.
Yes, it is a form of “big design up front” but because of the fuzzy nature, it’s more like “small design up front” and because the execution of the spec is trending to “free” (in terms of time/effort, not necessarily token cost) we can see the results of that immediately.
Meet Particular Requirements
We can’t just blindly accept LLM code without reasonable acceptance criteria. A spec, in this case, isn’t really the acceptance criteria because it’s far too vague. It’s a context doc, describing what, not how. “I’m using Rails 8. When a GCP Webhook comes in, we need to process that message and ensure a File object is attached to the right Company and Studio. The JSON is {..}. We should follow standard Rails background job best practices and ensure that we handle failures gracefully.”
The requirements are implemented (currently - maybe someone will come up with an AI-first way of doing this differently) as tests. Functional, unit, integration tests help us describe the behavior of the system.
A comprehensive test suite is necessary to hand off implementation to an LLM. Necessary, but not sufficient. It’s one leg of the stool.
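As an illustration of requirements expressed as tests, here is a small Minitest sketch built around the webhook example from the spec above. Everything in it is hypothetical: WebhookProcessor, Studio, and StoredFile are in-memory stand-ins rather than the real Rails models, and the three requirements (right company and studio, no duplicates on redelivery, loud failure on an unknown bucket) are examples of the kind of behavior you might pin down, not a list taken from the actual suite.

```ruby
require "minitest/autorun"

# Hypothetical in-memory stand-ins for the Rails models.
Studio = Struct.new(:company, :name, :bucket)
StoredFile = Struct.new(:studio, :object_name)

# Hypothetical implementation under test. This is the disposable part:
# it can be regenerated freely as long as the tests below still pass.
class WebhookProcessor
  class UnknownBucket < StandardError; end

  attr_reader :files

  def initialize(studios)
    @studios_by_bucket = studios.to_h { |studio| [studio.bucket, studio] }
    @files = []
  end

  def process(bucket:, object_name:)
    studio = @studios_by_bucket.fetch(bucket) { raise UnknownBucket, bucket }
    existing = @files.find { |f| f.studio == studio && f.object_name == object_name }
    return existing if existing

    StoredFile.new(studio, object_name).tap { |file| @files << file }
  end
end

# The requirements, written as tests.
class WebhookRequirementsTest < Minitest::Test
  def setup
    @acme = Studio.new("Acme", "Studio A", "final-acme-studio-a")
    @globex = Studio.new("Globex", "Studio B", "final-globex-studio-b")
    @processor = WebhookProcessor.new([@acme, @globex])
  end

  def test_file_is_attached_to_the_right_company_and_studio
    file = @processor.process(bucket: "final-globex-studio-b", object_name: "take-01.mp4")
    assert_equal "Globex", file.studio.company
    assert_equal "Studio B", file.studio.name
  end

  def test_redelivered_webhook_does_not_create_a_duplicate
    2.times { @processor.process(bucket: "final-acme-studio-a", object_name: "take-01.mp4") }
    assert_equal 1, @processor.files.size
  end

  def test_unknown_bucket_fails_loudly
    assert_raises(WebhookProcessor::UnknownBucket) do
      @processor.process(bucket: "someone-elses-bucket", object_name: "take-01.mp4")
    end
  end
end
```

Saved to a file and run with ruby, this executes the three tests. None of them says anything about how the processor is structured, which is what lets the implementation be thrown away and rewritten.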
Be Safe
Specs are lossy. Tests can help us with behavior. But implementation is not really the place for us to put tests. Can we write a test that looks for has_secure_p
[truncated for AI cost control]