AI News HubLIVE
站内改写6 min read

Context as Code

As syntax becomes cheap and abundant, architectural control becomes the scarce resource. Effective governance starts upstream, where intent, constraints, and threat models shape the agent’s working context before generation begins. The goal isn’t better prompting but build-time boundaries that prevent structurally invalid code from entering the system.

SourceO'Reilly AI & ML RadarAuthor: Artur Huk

As syntax becomes cheap and abundant, architectural control becomes the scarce resource. Effective governance starts upstream, where intent, constraints, and threat models shape the agent’s working context before generation begins. The goal isn’t better prompting but build-time boundaries that prevent structurally invalid code from entering the system.

The Frankenstein factories

The dark factories (as Dan Shapiro calls them) are running. Tokens fly through trycycles, features ship overnight, and codebases are ported before breakfast. The velocity is real. And comprehension debt (a term coined by Addy Osmani) is compounding in silence behind it.

What this era is producing, at scale, deserves its own name: Frankenstein factories. Not a critique of any single approach but a description of a structural condition—generation engines so effective at producing working syntax that they have industrialized the creation of architecturally ungovernable systems. The creature walks out of the laboratory impressive, functional, and alive on delivery day.

The crisis arrives the day someone must govern it. To govern a system means to hold it accountable to its design boundaries—the ability to look at it and reliably say why it works, what is permitted to touch what, and to categorically prevent forbidden state changes before they happen. Victor’s catastrophe was not the act of creation but the absent governing frame.

For prototyping or shipping features fast, unconstrained generation is a powerful tool. It optimizes for velocity, and it delivers. But for enterprise payment systems, insurance underwriting engines, logistics orchestrators, and regulated platforms, the question is not “Does the code ship?” but “Who is liable when it does the wrong thing?” Here, automating the word “YES” to every feature request does not solve the problem. It industrializes it.

Consider a standard Jira ticket: “Add an email notification after a successful payment.”

A junior developer might attempt to wedge the email-sending logic directly into the PaymentProcessor class. A senior architect catches this in code review: “No. Fire a PaymentSuccessEvent to the message bus.” That human friction—the architectural “No”—keeps the system maintainable.

Unconstrained AI agents lack this assertiveness. By default, they are the ultimate yes-men.

Hand that same ticket to a standard coding agent and it will not argue about bounded contexts. It will burn tokens until it produces 300 lines of syntactically perfect code, import an SMTP library directly into the core of your billing domain, and submit a pull request. The tests will pass; conventional feature tests make no assertion about bounded contexts. The CI pipeline will go green. And structurally, the system is now a disaster.

This happens not through malice but because of how agentic loops are built. Without explicit architectural constraints, the system’s emergent behavior is to fulfill immediate user intent. The agent is orchestrated to ship the feature, not to defend the architecture. Comprehension debt is the structural consequence: AI generates syntax faster than human beings can read or govern it. Expecting a probabilistic model to enforce structural integrity on its own is a category error. Without a governing frame, the agent will always take the path of least resistance to a “YES.”

You cannot fix code overproduction by hiring more people to read it nor by running the generation loop faster. The only scalable answer is to build a concrete riverbed before you turn on the water.

If the current era automates the word “YES,” we should automate the word “NO.”

Securing the runtime environment prevents the monster from escaping. But to prevent it from being built in the first place, we need to step back into the IDE and the CI/CD pipeline. We need to govern generation.

The great softening: Shifting risk from build time to runtime

Compilers never guaranteed correct software. You could write catastrophic logically broken systems in C, Java, or any other compiled language. But compilers served a crucial engineering purpose: They deterministically governed a specific layer of structural risk.

By enforcing hard execution constraints—syntax validity, type compatibility, linkage rules, and executable viability—the compiler acted as an automated boundary. It didn’t verify business intent, domain correctness, or architectural quality. What it did was eliminate an entire class of low-level structural failure before execution ever began.

That delegation of risk is one of the quiet triumphs of software engineering. Our discipline has always advanced by mechanizing one class of guarantees so humans can focus on the next layer of abstraction. We automated machine-level structural correctness so engineers could spend their cognitive energy on application logic. Later, we pushed more guarantees upward, into schemas, testing, static analysis, architectural patterns, and operational controls.

Over time, we also deliberately softened certain boundaries in exchange for speed. Dynamic languages, richer runtimes, reflection, and increasingly abstract frameworks all traded deterministic compile-time guarantees for developer velocity and flexibility. The newly exposed risk was absorbed elsewhere: runtime validation, automated testing, observability, and engineering discipline.

Today, with agentic AI, we are softening boundaries again, more radically than ever before.

Natural language has become a high-level control plane for software generation. Arbitrary text increasingly shapes executable behavior. And in that shift, we have blurred one of the oldest boundaries in computing: the separation between data and instructions.

Outside the model, that boundary still exists. Systems enforce permission scopes, schema contracts, sandboxing, and execution policies. But inside the inference context, those protections collapse into the same token stream.

System prompts, retrieved documents, user messages, tool outputs, and external content all flow through the same neural weights. There is no hard privilege boundary between instruction and input. Modern models may resist naive attacks like “Ignore previous instructions,” but they remain vulnerable to indirect injections disguised as legitimate operational context. A malicious instruction embedded in a customer email, a webpage, or a tool response is not processed as passive data. It can become behavioral influence.

Inside the context window, untrusted text can shape control flow. That is the real softening.

We are generating syntax at machine speed, but we have dissolved the structural gate that once constrained how systems were built. The result is a massive shift of risk from build time to runtime. Code that appears structurally sound during generation may violate architectural boundaries, introduce unsafe execution paths, or become behaviorally compromised the moment hostile context enters the loop.

The conclusion is straightforward: The fact that AI-generated code runs is no longer a meaningful proxy for system correctness.

Syntax is abundant. Execution is easy. Structural governance is what is missing.

We outsourced the writing of logic to machines, but we did not build a deterministic boundary that governs what those machines are allowed to generate.

If we want control back, we cannot rely on human code review at machine speed. We must rebuild the build-time gate.

From dependency bloat to tailor-made architecture

For decades, the industry’s default response to complexity was abstraction by accumulation: monolithic frameworks, sprawling dependency trees, and ever-thicker layers of indirection. Importing a 50-megabyte library to avoid repetitive boilerplate was a rational trade-off when developer time and cognitive bandwidth were the scarce resources. For AI agents, that trade-off changes.

This is not an argument against foundational infrastructure. Mature primitives—like SQLAlchemy in Python or Spring Boot in Java—remain essential precisely because their conventions are widely learned and predictable. The problem isn’t abstraction but opacity. When core business logic disappears behind proprietary decorators, internal frameworks, or custom orchestration layers, execution becomes a black box. An agent cannot safely reason about code it cannot trace. It needs direct visibility into causality: what changes state, what enforces invariants, and where responsibilities begin and end. Hidden flow degrades reasoning into guesswork; guesswork silently becomes architectural drift.

At the same time, AI drives the cost of procedural code toward zero. Boilerplate is no longer expensive. Clarity is. The design question shifts from “How much can we abstract away?” to “How much must remain explicit for safe reasoning?” The answer is tailor-made architecture: thin infrastructure, explicit domain logic, hard boundaries, and narrowly scoped components with visible contracts. The value is no longer in how much code you avoid writing but in how clearly the system declares its boundaries.

That same opacity also breaks verification. AI review can catch local defects, risky patterns, and implementation mistakes, but it remains blind to architectural drift and missing business intent unless those constraints are explicitly encoded. After all, if you ask a model to review code generated from the exact same vague Jira ticket, do you actually get verification, or do you just engineer a circular hallucination, where the AI politely revalidates its own blind spots?

Figure 1. Tailor-made architecture gives generated syntax a clear structure without dissolving system boundaries.

The Context Compilation Pattern

The Context Compilation Pattern governs generation in the IDE and the CI/CD pipeline before a single syntactically plausible line ever reaches a human reviewer. If the Decision Intelligence Runtime (DIR) is the vault door that protects execution in production, context compilation is the blueprint that prevents the monster from being built in the lab.

This is not “prompt engineering,” which merely asks a probabilistic model for a better answer. What we need is build-time governance: two layers of defense assembled before the LLM inference is even triggered. The first is structured context injection (assembling the prompt from prioritized artifacts). The second is postgeneration static verification (deterministic AST checks that enforce rules no probabilistic model can override). The prompt structure biases generation toward compliant solutions; the static checks make declared, machine-verifiable boundary violations impossible to merge.

Deterministic build-time governance is not a return to formal software specification (like UML), nor is it merely “prompt engineering disguised as Markdown.” It’s a mechanical constraint on the generation space that makes explicitly declared boundary violations rejectable by design. Context compilation does not eliminate architectural review or replace engineering judgment. Instead, it ensures that the agent operates within a defined riverbed of allowed structural invariants.

Engineering evolves whenever implicit rules become explicit declarations. Application development is now crossing that boundary. The senior engineer’s new job is declarative boundary engineering: explicitly declaring what the system is absolutely forbidden from doing.

The failure is not in the frameworks. The failure is in the process: pointing an unconstrained AI agent at a codebase full of invisible magic and expecting a CI/CD pipeline designed for human-generated code to catch what goes wrong. The answer is to build a compiler for the agent’s context.

The Context Compilation Pattern is the staged pipeline that makes this concrete.

Figure 2. The Context Compilation Pattern pipeline, enforcing build-time constraints through deterministic artifact assembly and dual verification.

Step 1: The context

[truncated for AI cost control]