Software Architecture After AI
This article examines how AI dramatically reduces the cost of reversing code-level decisions, thus redefining the boundaries of software architecture. The author argues that many previously architectural decisions (like module structure, framework choice) are no longer architectural, while data architecture, service boundaries, and user trust remain difficult to change. AI also elevates the importance of observability and business strategy alignment.
One of the more useful definitions of software architecture comes from Building Evolutionary Architectures: architecture is definitionally the stuff that’s hard to change.[1] I’ve always found this definition to be the most honest framing available, to say nothing of the simplest. It doesn’t pretend architecture is about beauty or correctness or your resident architect’s favorite stalking-horse. It acknowledges that what makes a decision “architectural” is not its conceptual weight but its cost to reverse and its business impact. And “hard to change” has always been, at root, about wall-clock time: coordination cost, incident mitigation, cognitive load, handoff friction. Software architecture has always been a labor problem dressed up as a design problem.
AI has now collapsed the wall-clock time required to make substantial code-level changes. Things that used to take months now take days. What happens to architecture when the cost to reverse most code-level decisions drops by something like an order of magnitude?[2]
What happens is that the boundary of what counts as software architecture moves, in some cases dramatically. Most code-level decisions are no longer inside it; their cost to reverse has collapsed, and the consequences of getting them wrong are measured in days of rework rather than months. What stays inside is data, service boundaries, and user trust, which remain hard because the hard part was never the code. And a few concerns crowded out by code-level debates now stand in sharper relief; observability in particular deserves a reconsideration in the pantheon as the rate of feature delivery increases.
When every line of code was clean, and real engineers refactored
For decades, code-level decisions were legitimately architectural. Languages, frameworks, module structures, and persistence strategies were worth debating and committing to, because revisiting them could cost months or even years of effort and had long-term implications for the team’s productivity; companies lived and died in the time it took large firms to reverse course. Even Refactoring was predicated on the idea that code-level change was possible but costly; restructuring code took skill and real time, and you needed techniques to manage that cost.[3]
But software practitioners have been collapsing architectural decisions into routine ones for decades. The effect of leaning into pain, rather than avoiding it, is to incentivize teams to build tooling that addresses it, turning what used to be architecture into something a general-purpose engineer handles as a matter of course.
Before database migrations were commonplace, schema decisions were irreversible, and presumptively architectural; they often required DBAs to orchestrate them.[4] Then Pramod Sadalage wrote Evolutionary Database Design, migrations got folded into every major framework, and the DBA role started to become less visible. The judgment and expertise DBAs provided were real (and substantial), but their market value was inflated by the mechanical bottleneck. Once the bottleneck was removed, the costs of a dedicated gate became more visible and the judgment got absorbed into the general engineering role, which gave many teams more leverage. Continuous delivery did the same thing for deployment and release engineers. There may be no silver bullets (until now?), but each small shift in tool efficiency took a category of decision that used to require a specialist and made it mechanical. Each revealed that a specialized judgment was often general engineering skill trapped behind mechanical cost.
AI is simply the latest example of this, but it’s the most dramatic, because it collapses most remaining code-level decisions at once rather than one category at a time. A recent personal example: I wrote my (now-dead) startup against a NoSQL database whose vendor was also a startup, which (surprise, surprise) also died. I pointed Claude Code at it, gave it some guidance, and it ported the entire data layer to a conventional RDBMS, essentially flawlessly, in hours. I know this sort of thing has become commonplace, but it still surprised me: between the tedium of the work and my day job, I might never have accomplished it before the heat-death of the universe.
This is not an isolated example. Cloudflare’s team reimplemented 94% of the Next.js API surface in under a week for roughly $1,100 in API costs. Christopher Chedeau ported 100,000 lines of TypeScript to Rust in a month. Many of you have experienced similar shifts.
In some ways, these examples prove the rule that good structure matters: I built my data layer against a clean interface boundary, because I didn’t start writing code yesterday, so in some sense, of course swapping the implementation was straightforward. But even without a clean boundary, the change is fundamentally mechanical: find all the call sites, change all the implementations, verify correctness. More tokens, more time, and yeah, more human intervention, but we’re not talking about a vast difference; maybe days instead of hours. And the second-order effect of agentic development is that you can automate verification on top of it; you can build correctness-checking into the process itself.
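To make the "clean interface boundary" point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `UserStore` protocol, the two backends, and `verify_store` are illustrative names, not the code from the migration described above. An in-memory dict stands in for the original document store and in-memory SQLite for the RDBMS.

```python
import sqlite3
from typing import Optional, Protocol


class UserStore(Protocol):
    """Hypothetical boundary that the rest of the application codes against."""

    def put(self, user_id: str, name: str) -> None: ...
    def get(self, user_id: str) -> Optional[str]: ...


class DictStore:
    """Stand-in for the original document-store backend."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def put(self, user_id: str, name: str) -> None:
        self._docs[user_id] = name

    def get(self, user_id: str) -> Optional[str]:
        return self._docs.get(user_id)


class SqliteStore:
    """Replacement backend on a relational database (in-memory SQLite here)."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT NOT NULL)"
        )

    def put(self, user_id: str, name: str) -> None:
        # INSERT OR REPLACE makes put() overwrite, matching DictStore.
        self._db.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user_id, name),
        )

    def get(self, user_id: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


def verify_store(store: UserStore) -> bool:
    """Behavioral check that any backend must pass, old or new."""
    store.put("u1", "Ada")
    store.put("u1", "Ada Lovelace")  # a second put must overwrite the first
    return store.get("u1") == "Ada Lovelace" and store.get("missing") is None


# The same check runs against both implementations.
assert verify_store(DictStore())
assert verify_store(SqliteStore())
```

Callers depend only on `UserStore`, so the swap touches one class, and `verify_store` is the kind of automated check an agent can run after every change.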
These examples are admittedly biased toward the easy case: clean interfaces, well-defined boundaries, mechanically verifiable correctness. Systems with subtle semantics, unclear boundaries, and deeply entangled business logic remain harder to change, even with AI. But the trend line matters. AI does not erase coupling, migration risk, or rollout complexity, but it does demote a lot of code-shaping decisions that used to feel permanent, and make the rest dramatically easier to monitor and fix. That places a growing category of change squarely in the territory of “not architecture anymore.”
If code has moved outside the boundary, what’s still inside it?
It’s definitionally ridiculous to create a comprehensive taxonomy of software architecture, but I’ve broken out six categories below that I think capture the shape of the shift, and attempted to classify them along two axes: consequence of getting a decision wrong, and cost to reverse it. This is gut-level stuff, not hard data, so bear with me; I’m just trying to visualize the shift.
| Decision | Status | Why |
| --- | --- | --- |
| Local code structure | ↓ Demoted | Modules, frameworks, persistence, integration wiring. AI makes mechanical restructuring cheap; getting it wrong now costs hours, not quarters. |
| Scalability and deployment posture | ↓ Demoted | Infrastructure topology and performance strategy. Still harder than code, but within reach of routine engineering with AI-assisted tooling. |
| Data architecture | → Still architectural | Ownership, consistency, schema evolution. Data has gravity; the hard part was never the code, and it hasn't meaningfully moved. |
| Trust and service boundaries | → Still architectural | Security posture and the contracts downstream consumers depend on. Breaches are effectively irreversible; contracts bind organizations, not just systems. |
| Observability and behavioral verification | ↑ Elevated | Code volume is up and comprehension is down. Verifying behavior is how you catch what you can no longer read. |
| Business strategy and capability alignment | ↑ Elevated | Always high-consequence; now finally visible. With code-level debates cheaper, architects have headspace for the question the work exists to answer. |
Three movements, for three distinct reasons.
Some decisions got demoted because AI collapsed the cost to reverse them. Local code structure (modules, frameworks, persistence, integration wiring) used to command serious architectural attention because reversing a bad decision could cost months of calendar time; it no longer has to. Scalability and deployment posture followed: infrastructure topology and performance rework are still harder than ordinary code, but within reach of routine engineering with AI-assisted tooling. These decisions still require judgment, but that judgment is no longer trapped behind mechanical cost, and the consequences of a bad call are measured in hours and tokens rather than quarters and headcount.
Some decisions stayed put because the hard part was never the code. Data architecture (ownership, consistency models, schema evolution) didn’t move because data has gravity: it accumulates mass over time, and more things depend on its current shape than anyone can enumerate. Trust and service boundaries didn’t move either: security breaches and contract violations are effectively irreversible (though large corporations get away with a shocking amount of this), and reversing them requires coordinating with human beings, reshaping accumulated state, or undoing real-world consequences that code changes cannot reach.
Two decisions got elevated, for distinct reasons. Observability and behavioral verification rise because volume is rising: if defects per line stay constant but code volume quintuples, the consequence of failing to verify what the system actually does rises with it. Furthermore, if line-level comprehension similarly collapses (as in a dark software factory), you need to be able to verify what the system does regardless of whether you understand every line. The implementation of monitoring is cheap; the decision about what to watch and how to verify behavioral correctness is not. Business strategy and capability alignment, by contrast, didn’t move on the chart at all; they were always high-consequence, and reversing a strategic misstep has always been expensive. What changed is that architects finally have the headspace to engage with them. With code-level debates cheaper, the question of which boundaries create competitive advantage is no longer crowded out by framework arguments.
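One way to picture verifying what the system does, regardless of whether you understand every line, is an invariant checked against observed behavior rather than against the source. The sketch below is a hypothetical illustration: the `Entry` record, the double-entry invariant, and `unbalanced_transactions` are assumptions chosen for the example, not something drawn from a particular system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical ledger entry emitted by the system under observation."""

    txn_id: str
    account: str
    amount_cents: int  # positive = credit, negative = debit


def unbalanced_transactions(entries: list[Entry]) -> list[str]:
    """Return IDs of transactions whose entries do not sum to zero.

    The check looks only at emitted entries, so it holds no matter how
    (or by whom) the code that produced them was written.
    """
    totals: dict[str, int] = defaultdict(int)
    for entry in entries:
        totals[entry.txn_id] += entry.amount_cents
    return sorted(txn for txn, total in totals.items() if total != 0)


observed = [
    Entry("t1", "alice", -500),
    Entry("t1", "bob", 500),
    Entry("t2", "carol", -300),
    Entry("t2", "dave", 250),  # 50 cents went missing somewhere
]
print(unbalanced_transactions(observed))  # prints ['t2']
```

Writing the function is the cheap part; choosing "every transaction must balance" as the thing worth watching is the decision that stays expensive.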
You could reasonably argue that code structure is the enforcement mechanism for precisely the things I’m calling architectural. Good module boundaries help enforce API contracts; good type systems help protect data invariants. If you stop caring about code structure, don’t you risk undermining the contracts you claim to care about? I think this confuses the decision with its implementation. The shape of the contract, and its guarantees, are the hard part; changing them costs real time, because you have to coordinate with human beings to do it. The code that enforces them is implementation, and implementation is now cheap. You can swap out the enforcement mechanism without touching the contract it enforces; that’s what my startup migration work did.
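The distinction between a contract and its enforcement can be sketched in a few lines. In this hypothetical Python example, `ORDER_CONTRACT` is the agreed shape that downstream consumers rely on, and the two checker functions are interchangeable enforcement mechanisms; all of the names are illustrative assumptions.

```python
# The contract: field names and types that consumers depend on.
# Changing this dict is the expensive, coordinate-with-humans part.
ORDER_CONTRACT: dict[str, type] = {"order_id": str, "total_cents": int}


def check_with_loop(payload: dict) -> bool:
    """Enforcement mechanism A: an explicit loop over the contract."""
    if set(payload) != set(ORDER_CONTRACT):
        return False
    for field, expected in ORDER_CONTRACT.items():
        if type(payload[field]) is not expected:
            return False
    return True


def check_with_all(payload: dict) -> bool:
    """Enforcement mechanism B: same guarantees, different implementation."""
    return payload.keys() == ORDER_CONTRACT.keys() and all(
        type(payload[f]) is t for f, t in ORDER_CONTRACT.items()
    )


samples = [
    {"order_id": "o-1", "total_cents": 1299},    # valid
    {"order_id": "o-2", "total_cents": "1299"},  # wrong type
    {"order_id": "o-3"},                         # missing field
]

# Swapping A for B is safe exactly when both agree on every payload.
assert [check_with_loop(s) for s in samples] == [check_with_all(s) for s in samples]
```

Either function can be rewritten or replaced freely as long as that final agreement check keeps passing; `ORDER_CONTRACT` itself is what cannot change without talking to the people who consume it.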
There’s a related objection worth engaging: mid-level design decisions (“should this business logic live in service A or service B?”) accumulate over time into the overall malleability of a system, and those accumulated decisions are genuinely hard to untangle. This is true, but it’s always been true. It was never centrally controllable in the first place: teams generally put shit where it seems like it should go, optimizing for local autonomy and throughput no matter how much you try to govern it centrally, and the result is always some degree of drift. What’s changed is that an LLM strapped to a codebase search index (which is rapidly becoming table stakes) can actually find all of it, reason about how it ended up there, and help you reorganize it. The accumulated impact of mid-level decisions, while still important and probably still architectural, is more tractable with AI, not less; the cost of untangling it has dropped, even if it hasn’t vanished.
Pattern amplification is not destiny
You will hear the counterargument that AI makes code quality more important, not less, because it amplifies both good and bad decisions at vol