2026-06-07 18:36 UTC | In-site rewrite | 6 min read | Updated: 2026-06-30 13:03 UTC

AI coding optimized for the part of engineering that hurt the least

AI coding tools excel at writing new code but fail during operational incidents like 3am production outages, where knowledge context is missing. Engineers spend 84% of their time on non-coding tasks, primarily context gathering. The article argues for treating team knowledge as infrastructure and suggests capturing rationale, putting constraints in the workflow, and closing the feedback loop from incidents.

Source: Hacker News AI | Author: srbsa

Most tools being built show up when you're writing new code. None of them show up when you're trying to save production at 3am. The gap is not an accident, and it is about to become the most expensive thing in your engineering org.

It's 3:14 am, and Daniel's (name changed) phone is doing something unusual: instead of the polite buzz of a Slack mention, it is sounding the incessant alarm of a page that has already gone unacknowledged once. He's the senior on call this week, as he is one week every month.

Checkout latency is through the roof with the error rate rising. He's awake, laptop open, and the clock that matters — the one that turns into refunded orders and a Monday-morning conversation with the VP — has already started.

Here is what the next ninety minutes actually look like. Not the sanitised retro version, the real one.

He starts where everyone does: "what changed?" The question sounds simple but is not, because the answer is scattered across six tools that don't talk to each other. He checks the deployment dashboard; three services shipped in the last few hours. He scrolls the team Slack, two hundred messages deep, looking for whether anyone mentioned touching the payments path. He pulls up the company's internal search (they pay for a good one) and types "checkout retry timeout", which returns a wall of documents: a runbook from 2023, two design docs, and a half-finished Confluence page. The search technically worked, but it did not answer the question. He still has to open each one, skim it, decide whether it is current, and scribble notes as he opens the next. He is synthesising by hand at 3 am on about three hours of sleep.

Thirty minutes in, he has yet to fix anything. He is assembling a picture of his own system.

He's not slow; this is just the job. Analysis of incident response reveals that 40% to 60% of the time is lost reconstructing context across tools, with engineers burning 15 to 25 minutes gathering scattered information before investigation begins. A typical outage involves 5 minutes on Slack, 10 minutes checking recent commits, 5 minutes reviewing dashboards, and only then — 25 minutes in — finally understanding a specific deployment caused the spike. The technical fix, once you know what to do, is often trivial. Getting to the point where you know what to do is where the night goes.

Eventually Daniel narrows it down from the shape of the time series: latency creeping up in a sawtooth that smells like retries stacking up. Now he needs the data to confirm it, which means not just a different tool but a different mode of work entirely. He is writing ad hoc queries against the metrics backend, massaging time windows, and building a throwaway script to pull the numbers he needs, because nobody built the dashboard he needs in advance. Terminal, browser, metrics tool, back to terminal. Every tool switch breaks the flow and forces him to reorient.

He finds it. Someone raised the retry cap in one of those afternoon deploys. There was a reason the cap was where it was — there always is — but it lives in a code-review comment from eight months ago and in the head of the staff engineer who is, mercifully for her, asleep. Daniel makes the call, ships the fix, watches the graph normalise, and posts the all-clear in the channel.

Then comes the part he hates the most.

He has to write it all down: the postmortem, with the timeline, the root cause, the decisions he made, and the follow-ups, so that the next person who hits this does not have to re-run his entire night. At 5 am, he also knows that he is going to write a thin, joyless version of it because he is exhausted, and that three months from now someone will hit something adjacent and have to start their own 3 am from scratch. How are they going to find this anyway?

Here's the part worth sitting with.

At no point during that entire night did any of the AI tooling his company had invested in have anything to offer him.

The coding agents that can scaffold a service in seconds? Silent. The coworker agents drafting PRDs and summarising meetings? Nowhere.

The entire promise of "AI is transforming software development" was absent during the ninety minutes when Daniel's job was the hardest, the stakes were the highest, and the company was actively losing money. The agents are present for the joyful part of writing happy-path code and are absent for the operational reality that senior engineers spend most of their lives in.

That should bother you more than it probably does.

This was never a coding problem

Let's replay Daniel's night and ask at each step, "What was actually missing?"

It wasn't the ability to write code. He can write code in his sleep, nearly literally. What was missing, every single time, was knowledge that existed somewhere in the organisation but not where he needed it, when he needed it.

What changed in the last four hours? That existed in the deploy logs and the PRs. What the retry cap was set to, and why? That existed in a review thread. Whether anyone had seen this shape before? That existed in a past incident that no one could surface. The fix existed the moment he reconstructed enough context to see it.

Every step of the incident was a knowledge problem wearing an incident's clothes. The search tool didn't fail because it was bad; it failed because finding a pile of documents to read is not the same as answering a question. At 3 am the synthesis tax is paid in the most expensive currency: a senior engineer's attention, under pressure, against the clock.

This is the quiet truth about senior engineers that no org chart shows: the most experienced people function as the company's living knowledge layer. The reason the team "asks Daniel" is that Daniel has, in his head, the mapped, reconciled, current, sourced understanding of how things actually work and why. This is the stuff that was decided in a meeting, argued out in a review, learnt the hard way during a previous outage, and never written down anywhere a machine or a newcomer could reach. He is the knowledge base. Which is only wonderful right up until he's asleep, or on vacation, or has left the company and taken the only copy with him.

This is not a small or exotic problem; it is just a median engineering day. Multiple studies converge on the finding that developers spend only ~16% of their time actually writing application code, with the rest going to operational and supportive work: monitoring, maintenance, coordination, and the endless hunt for context. Around 64% of them report spending more than 30 minutes a day just searching for information, and a third spend over an hour. The real bottleneck was never writing code. It was knowing.

AI got dramatically better at the 16% while the other 84% barely moved

This is where it gets complicated for anyone betting their roadmap on "AI will fix engineering velocity".

The coding harnesses have gotten really good very fast, aside from the occasional stub and hallucinated API. Yet in a controlled trial, experienced developers working in codebases they knew well were measurably slower when using AI tools and, more tellingly, believed that they had been faster. The people running the study and the developers themselves had both predicted a speedup. The reality was the opposite, and the participants couldn't feel it.

Hold that finding honestly, with its caveats: it was a small study on mature codebases with early-2025 tooling. The gap has likely narrowed as models, coding harnesses, and workflows have matured. The point that survives is the perception gap: engineers will feel faster whether or not they really are, which means the feeling is not a metric you can run an engineering org on.

Step back from coding and the larger pattern shows up. Enterprise analysis of why agent-assisted work fails keeps arriving at the same culprit: not raw model capability, but missing context and planning. The agent didn't know the constraint, how the org functions, or the thing that was never explicitly written down or prompted. The capability curve went vertical, but the curve for team-specific operational knowledge didn't move at all. The retry rationale, the services that must be deployed together, the operational gotchas: they are still trapped in the minds of people like Daniel and scattered across tools.

The gains therefore stall at a very specific place: where knowledge lives in someone's head. Which is exactly where Daniel was standing at 3:14 am.

It gets worse as you adopt more agents

The instinct is: "so we'll point the agent at the operational work too." Good instinct. But notice what happens to the knowledge gap as you add agents.

Every agent is a brand-new hire with no memory. It has never sat in your architecture review. It doesn't know that payments and inventory must deploy together, or that the "temporary" rate limiter from last spring is now load-bearing. So every agent, every time, re-asks the same questions that new engineers ask, and the answers still live in the minds of your seniors. You haven't reduced the demand on Daniel for context; you have multiplied it by the number of things asking him and pointed them all at the same undocumented bottleneck.

Meanwhile, the underlying decay continues. The tribal understanding erodes with every departure and every reorg. New engineers, humans and agents alike, ramp slowly because onboarding is the transfer of exactly this unwritten knowledge, and that transfer doesn't scale by hiring more or spinning up more agents. Operational load has risen by 30%, its first increase in five years, while the knowledge needed to handle it gets thinner and more thinly spread.

The more agentic the engineering org becomes, the more unwritten, unreconciled, person-trapped knowledge becomes the bottleneck for everything. You can buy more capability, but you cannot buy back the context your team never wrote down.

What good teams are starting to do about it

The teams getting ahead are treating their working knowledge as a real infrastructure layer — not a byproduct that happens to accumulate in the minds of people. Here are a few moves:

Capture the why, not just the what. A merged PR records what changed. The reasoning for why the retry cap is 2, and must stay at 2, usually evaporates into a review thread and then into human memory. The decisions that cause incidents are the ones whose rationale was never written where the next person would look.

Put the constraints where the work happens. A rule that lives in a Confluence page nobody opens during an incident may as well not exist. Knowledge has to surface at the moment of action: when the diff is being written, when the change is being reviewed, when the page is firing. It cannot sit in a wiki that you'd need to remember to consult.
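As a minimal sketch of the two moves above, a recorded constraint can carry its rationale and be checked at review time, so the "why" surfaces when the diff is written. Everything here is hypothetical and for illustration only: the `Constraint` type, the `checkout.retry_cap` key, the `review_change` helper, and the rationale wording are assumptions, not anything from Daniel's real system. Only the cap value of 2 comes from the story.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """A recorded decision: the rule plus the reasoning behind it."""
    setting: str     # config key the rule applies to (hypothetical name)
    max_value: int   # highest value the team agreed to allow
    rationale: str   # the "why" that usually evaporates into a review thread
    source: str      # where the decision was originally made


# Hypothetical registry; the rationale text below is illustrative wording.
CONSTRAINTS = [
    Constraint(
        setting="checkout.retry_cap",
        max_value=2,
        rationale="higher caps let retries stack up under load",
        source="code-review thread (illustrative)",
    ),
]


def review_change(changed: dict[str, int]) -> list[str]:
    """Return one warning per changed setting that exceeds a recorded constraint."""
    warnings = []
    for constraint in CONSTRAINTS:
        new_value = changed.get(constraint.setting)
        if new_value is not None and new_value > constraint.max_value:
            warnings.append(
                f"{constraint.setting}={new_value} exceeds the recorded cap of "
                f"{constraint.max_value}: {constraint.rationale} "
                f"(source: {constraint.source})"
            )
    return warnings


if __name__ == "__main__":
    # A change like the one in the afternoon deploy would be flagged at review.
    for warning in review_change({"checkout.retry_cap": 5}):
        print(warning)
```

The design choice worth noting is that the rationale travels with the rule, so whoever raises the cap sees the reason before merging rather than at 3 am.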

Treat context as a first-class artefact. The reconciled answer that a senior would give (current, sourced, and specific to your system) is what teams need to build and maintain deliberately.

Close the loop from incidents back to knowledge. The postmortem Daniel hates writing at 5 am exists to feed exactly this layer. The tragedy is that it's a manual tax paid by an exhausted human, when most of what it needs to capture (timeline, deploys, decisions, threads) already exists in the systems that watched it happen.
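To make that concrete, here is a rough sketch of drafting the postmortem timeline from events that were already recorded, instead of from memory. The `Event` type, the `draft_timeline` function, and every sample event and timestamp below are invented for illustration; a real version would read from whatever paging, deploy, and chat systems a team actually runs.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    at: datetime   # when it happened (UTC)
    source: str    # which system recorded it, e.g. "pager", "deploy", "chat"
    summary: str   # one-line description


def draft_timeline(events: list[Event], start: datetime, end: datetime) -> str:
    """Merge events from every source into one chronological timeline."""
    in_window = [e for e in events if start <= e.at <= end]
    in_window.sort(key=lambda e: e.at)
    return "\n".join(f"{e.at:%H:%M} [{e.source}] {e.summary}" for e in in_window)


def utc(hour: int, minute: int) -> datetime:
    """Helper for the sample data; the date itself is arbitrary."""
    return datetime(2026, 1, 1, hour, minute, tzinfo=timezone.utc)


if __name__ == "__main__":
    # Invented sample events, deliberately out of order.
    events = [
        Event(utc(4, 45), "chat", "all-clear posted in the channel"),
        Event(utc(3, 14), "pager", "checkout latency page fired"),
        Event(utc(4, 40), "deploy", "fix shipped for the retry cap"),
    ]
    print(draft_timeline(events, start=utc(3, 0), end=utc(5, 0)))
```

This only produces the skeleton. The decisions and the reasoning behind them still need a human, but the human starts from a draft rather than a blank page.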

The last point is the tell. The reason on-call is a knowledge problem is also the reason it is solvable — everything the next responder needs was already produced and recorded somewhere. It just needed to be reconciled into one current answer at the moment it mattered.

The shape of the fix

Picture Daniel's night with one thing changed.

The page fires. Before Daniel opens his laptop, an agent queries: "What changed in checkout in the last 4 hours?" and "What are the hard constraints in checkout?" Before he opens the six tools, the reconciled answers are waiting for him: the three deploys, th

[truncated for AI cost control]