
Grassroots AI: Beyond the Moonshot

The author critiques the AI industry's obsession with fully autonomous code generation, drawing historical parallels to failed attempts like XML, UML, and LowCode/NoCode. They argue that natural language specs cannot replace code due to inherent ambiguity, non-determinism of LLMs, and the perfect map paradox. The only source of truth remains the code itself.

Source: Hacker News AI. Author: youknownothing

It would appear that, with AI, everyone is trying to go for the moonshot, the magical combination of elements that will make code write itself autonomously, reliably, sustainably and, more than anything, cheaply. This is not a new idea, and it's not working.

Look, I get it, I understand the allure of the moonshot: breaking new ground, creating new paradigms, being the celebrated hero on the cover of Time magazine. After all, if you're not disrupting an industry, what are you even doing? I can see what a massive win it would be if we managed to create that system that writes code while no one is looking, and I could almost forgive people for trying if it weren't for the countless cautionary tales available in the software lore.

"We will no longer need to look at the code", a history

XML was first published in 1998. At the time it was supposed to be a revolution, and it came with all sorts of tools: XSL, XPath, XSLT... Documents written in XML were supposed to be composable, and the vision was that documents all over the internet would be connected to each other, forming an interconnected information landscape. It was at that time that we started talking about the semantic web. The future was rosy.

The truth is that XML was clunky, nobody really liked it, but it was ok because it wasn't meant for humans, it was meant for machines. You weren't expected to read or write XML documents, you would only deal with graphical interfaces that would interpret and modify the XML content for you. Fast forward a couple of decades and XML has been largely replaced by JSON and YAML, which are more programmer-friendly. It turns out that people do like to read and manipulate files directly.

It was around that time that the first graphical IDEs like KDevelop and Eclipse came up. They had tons of plugins and graphical interfaces that showed multiple views of the project: one panel for Ant commands, another panel for listing the available methods, etc. This, together with the smaller screens and lower resolutions of the time, meant that the code panel was a tiny window in the middle, barely showing 15-20 lines of code. This was ok because you weren't really expected to look at the code; you only interacted with graphical elements. Creating a method implied File -> Add -> Method (or something to that effect), and you'd get a dialogue box where you entered the name, parameters, et al. Adding a dependency to Maven also had its own dialogue box. You didn't interact with the code, that was for machines. You interacted with a UI.

An early version of Eclipse IDE. Not a lot of space for code; not that it mattered.

Today, IDEs make the code panel as big as possible, or even show you two or more code panels in parallel. There are even minimalist options like Sublime that get rid of all the graphical elements so you can focus entirely and solely on the code.

That was also the time when Poseidon came up. UML was all the rage, OOP was taking off, and Poseidon combined all of that with a renewed promise: you draw the entities that make up your program using UML, you indicate the relationships and the attributes, and I generate the code. Because you are a highly-valued engineer and you shouldn't waste your time and talents writing code. No, sir.

The God of the Seas was apparently very good at programming.

Needless to say, that didn't stick. Its defenders would say things like "a junior engineer reads code, a senior engineer reads diagrams". They'd compare themselves to architects who handle blueprints, not bricks. But, in the end, we went back to the code.

ETL, LowCode, NoCode... I could go on forever. The dream has always been the same: we will no longer need to look at the code. Every time, doomsayers like me would mutter "we've tried that before". Every time, we'd hear back "but this time is different". And every time it became yet another broken dream. And every time we went back to the code.

In fact, what history has taught us is that not only did we refuse to give up on manipulating code directly, we doubled down! We took activities that were traditionally managed through configurations, settings, and UIs and we turned them into code too, giving birth to Infrastructure as Code, GitOps and, eventually, Everything as Code.

The code is the spec

In today's "but this time is different" camp, the theory is that you write a detailed spec of what you need and AI creates the code for you. So much so that people are comparing spec-driven AI to a compiler, saying that it is just another layer of abstraction. They claim that detractors of AI-generated code are just like the early programmers who didn't like high-level languages and preferred to continue using assembly. However, this analogy breaks down under scrutiny.

A programming language is typically described by a formal grammar, often written in BNF, that can be used to verify whether a program is syntactically correct. It also has a compiler that transforms the source language into a target language with mathematical precision. The use of the word "mathematical" is no exaggeration here: a compiler truly can be described as a mathematical function that transforms an input into an output in a well-defined and predictable way. In fact, this quality has threatened the very concept of software patents since its inception: all software can be rewritten as a mathematical formula using lambda calculus, and mathematical formulae cannot be patented. But I digress. The point here is that, when you compile source code, you know exactly what you're going to get (or at least what you are supposed to get).
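To make the compiler-as-function point concrete, here is a minimal sketch in Python. The grammar, the instruction names (PUSH, ADD, and so on) and the function names are all invented for illustration; none of this comes from a real compiler. It is only meant to show two properties: the grammar decides mechanically whether an input is valid, and the same source always maps to the same output.

```python
import re

# Hypothetical toy grammar (an assumption for this sketch), in BNF-like form:
#   expr   ::= term   (("+" | "-") term)*
#   term   ::= factor (("*" | "/") factor)*
#   factor ::= NUMBER | "(" expr ")"

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(source):
    """Split the source into number, operator and parenthesis tokens."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def compile_expr(source):
    """Translate an arithmetic expression into stack-machine instructions."""
    tokens = tokenize(source)
    pos = 0
    code = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        tok = peek()
        if tok is not None and tok.isdigit():
            code.append(("PUSH", int(tok)))
            advance()
        elif tok == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        factor()
        while peek() in ("*", "/"):
            op = peek()
            advance()
            factor()
            code.append(("MUL",) if op == "*" else ("DIV",))

    def expr():
        term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            term()
            code.append(("ADD",) if op == "+" else ("SUB",))

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return code


# Same input, same output, every time.
print(compile_expr("1 + 2 * 3") == compile_expr("1 + 2 * 3"))
```

Compiling "1 + 2 * 3" twice yields two identical instruction lists, and an input like "1 + * 2" is rejected with a SyntaxError rather than being interpreted charitably.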

The same is not true of AI; or, rather, of LLMs. Natural language cannot be formally defined. Natural language is messy, ambiguous, and, worst of all, its meaning changes with time, location, and culture. Or, as Wittgenstein put it, "the meaning of a word is its use in the language". That's why lawyers have such a hard time drafting agreements that won't come back to bite them. In theory, you should be able to write a spec in natural language that indicates what you want. In practice, that spec has the potential to be misinterpreted, or to miss implicit knowledge, or to be based on uncommunicated assumptions. AI, obliging as it is, will try to produce a program that matches your spec, but there will be inevitable hallucinations, omissions, and deviations. What's more, LLMs are non-deterministic by design: invoke one twice in a row with the same input and you'll get different outputs. The overarching behaviour may be preserved, but tiny drifts will be introduced here and there. And here is where human expectations clash: nobody likes it when buttons are randomly moved about, even if they function in exactly the same way.
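The drift comes from how output is usually produced: in typical use, each token is sampled from a probability distribution rather than looked up. The sketch below is a toy, not a real model. The candidate tokens, their scores, and the function names are all made up for illustration; the only point is that sampling the same distribution repeatedly gives varying answers, while always picking the top score does not.

```python
import math
import random

# Hypothetical scores for what comes after "Place the Save button ...".
# Tokens and numbers are invented for this sketch.
LOGITS = {"top-right": 2.0, "top-left": 1.6, "bottom-right": 1.2, "in a menu": 0.4}


def softmax(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample(logits, temperature, rng):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


def greedy(logits):
    """Always pick the highest-scoring token: repeatable, but only one answer."""
    return max(logits, key=logits.get)


rng = random.Random()  # deliberately unseeded, so every run differs
print([sample(LOGITS, 1.0, rng) for _ in range(5)])  # varies between runs
print([greedy(LOGITS) for _ in range(5)])            # always the same token
```

Even in this toy, the most likely answer wins only some of the time, which is the "buttons moving about" effect in miniature.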

The solution is, allegedly, very simple: make sure that your spec contains all the relevant detail, all the requirements, all the edge cases, all the knowledge. Make sure that it's written in a way that leaves no room for ambiguity. Make sure that it explicitly discards any potential misinterpretations. Make sure that all stakeholders have reviewed and approved it. In summary, make sure that your spec is well-defined. Sure, and a side of fries, please. The well-defined spec has been the holy grail ever since software development started as an industry. And it has never happened. The well-defined spec is the software equivalent of the spherical cow.

But let's play along and assume that we can indeed create a well-defined spec that contains the minutiae of the project. Now we find ourselves facing the perfect map paradox: the only way to specify without room for error everything that the code has to do is to specify it in as much detail as the code itself. The map becomes as big as the territory that it's trying to describe. The spec becomes redundant because it is no longer an abstraction of the code; it has turned into a copy of the code and, for that, we already had the actual code.

Ultimately, the code is the only source of truth.

After all, it's a matter of basic entropy, perfectly encapsulated in Plato's Theory of Forms: you can envision an abstraction from the particular, but creating the particular from the abstraction will always be imperfect.

Creating the spec with AI

In predictable "AI all the things" fashion, defenders of the newfangled Spec-Driven Development argue that you can embed AI in all your business interactions so the specs write themselves automatically (sound familiar?). Do you have a meeting? Enable AI note-taking so the conclusions of the meeting are automatically available. Did you record a user research session? Ask AI to extract the main points. Do you now have more documents than you could possibly deal with in a lifetime? Use AI to analyse the lot and distill the main points. Is this information sparse, contradictory, and potentially wrong? AI will fix that for you.

It is at this point that we need to learn from those who have been dealing with potentially unreliable data for decades: the Intelligence Community, aka CIA, NSA, SIS, and their friends. First of all, we need to distinguish between three concepts: data, information, and intelligence.

Data: facts, with no judgement or analysis attached to them. Things like "Bernice said X", or "200 toys were sold in March".

Information: structured data, organised and filtered in such a way that it becomes relevant in a particular context or for a particular purpose. Things like "toy sales have been in steady decline".

Intelligence: the analysis of information from the perspective of goals, values, and environment so as to generate actionable directives. Things like "competition is eating into our market share thanks to their more effective marketing".

AI can easily give you a lot of superficial data, most of it correct. AI can try to help generate information from the data, but this will often miss important nuances that are not written anywhere because they belong to the collective, unspoken knowledge (common assumptions, institutional inertia, cultural cues, etc.). If you try to use AI again to go to the next step, all the tiny little errors will compound to the point of creating intelligence that seems superficially fine, but that will have enough inaccuracies to make it unusable. Kind of like making a photocopy of a photocopy of a photocopy.
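A back-of-the-envelope way to see the photocopy effect, under two loud assumptions: that each AI pass keeps a fixed fraction of the content faithful, and that errors at each pass are independent of one another. Neither is a measured property of any real system, and the 95% figure below is purely hypothetical.

```python
def compounded_accuracy(per_stage_accuracy, stages):
    """Fraction of content still faithful after several independent passes."""
    return per_stage_accuracy ** stages


# Hypothetical pipeline: notes -> summary -> spec, then a couple more passes.
for stages in (1, 3, 5):
    print(stages, round(compounded_accuracy(0.95, stages), 3))
```

With these made-up numbers, three passes already leave only about 86% of the content faithful, and nothing in the output tells you which 14% went wrong.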

"O ye of little faith! That has a simple solution, you just need people to review and amend, if necessary, all the content that AI generates. So after a meeting, people review the minutes and approve them. Then, after AI has consolidated them, people review the summary and approve it. Then, when the spec is written..."

Go on, I'll let you finish, you can write your own rebuttal when you're done.

But dark factories!

Ah, yes, dark factories. In case you haven't heard, dark factories (or lights-off factories) are factories that are so incredibly automated that they don't even need humans any more. And, since they don't need humans, they don't even need to keep the lights on, so they run dark. So very cool. This idea is getting some people's juices flowing with the next sci-fi analogy: AI is our dark factory, you set a bunch of agents loose and they create code without any human intervention. "I built a crypto trading platform while I was walking my dog". That kind of thing.

Let's take a couple of steps back. First of all, there is the matter of time scales. The industrial revolution started almost 250 years ago, and only now are we beginning to talk about fully automated factories where there are no humans involved. The whole software industry hasn't been around for more than maybe 70 years, and AI for less than that; it is excessively optimistic to think that we're going to turn a highly manual job into a fully automated one in such a short time.

Second, there is a fundamental difference between manufacturing and software: in manufacturing, you produce multiple identical copies of the sam
