The AI Mirage or Why I Think the Hype Can't Sustain Itself

The article argues that AI, particularly LLMs, cannot achieve 100% reliability, necessitating human verification that undermines efficiency gains. Drawing parallels to self-driving cars and code generation, the author contends that the hype and sky-high valuations are unjustified, as the bottleneck remains human oversight.

Source: Hacker News AI. Author: louwrentius

If you zoom out enough, everything is a black box: we don't know its inner workings, but we can learn about it by putting stuff in and observing the output.

Let's pretend that a large language model (LLM) is such a black box. If we observe an LLM, we learn - amongst other things - that the output is 'correct' 99%[1] of the time.

This is a fundamental and important observation. Computers are known for their correctness, for their reliability. We know that if we run a function with input A, we get output X, no matter what. Yes, packets can be lost and memory can get corrupted, but those things go wrong in a predictable way. We check and retransmit packets, and we use ECC memory. And data is formatted in such a way that we can detect lost packets or corrupted memory in the first place. Our entire world relies on it.
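To make the detection point concrete, here is a minimal sketch of data that carries its own checksum. CRC-32 from Python's standard `zlib` module is used purely as an illustration; the `frame` and `is_intact` helpers are hypothetical names and do not represent the scheme of any particular network protocol or memory system.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload."""
    checksum = zlib.crc32(payload)
    return payload + checksum.to_bytes(4, "big")


def is_intact(framed: bytes) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    payload, stored = framed[:-4], framed[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


message = frame(b"1+1=2")

# Flip one bit in the first byte to simulate corruption in transit.
corrupted = bytes([message[0] ^ 0x01]) + message[1:]

print(is_intact(message))    # the untouched frame passes the check
print(is_intact(corrupted))  # the single-bit error is detected
```

The failure is detected mechanically, without a person reading the data. That is the kind of predictable failure the paragraph above describes.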

Imagine that a system can't be trusted. That 1+1 isn't always 2. Only 99% of the time. How valuable can such a system be? That probably depends on the circumstances, but we know one thing for sure: we can't trust the output, so it must be checked. It doesn't matter how, but it must be checked for correctness, and that requires a person.

We see it with self-driving cars. It's really impressive what is possible. But they aren't truly self-driving. A person needs to be sitting at the wheel, keeping their attention focussed on traffic as if they were driving themselves, so they can intervene when the AI makes an inevitable mistake.

Although this post isn't about self-driving cars, we know that humans have a tendency to get distracted and bored if they aren't actively engaged. We can argue that either people keep driving themselves, or we need 100% reliability, so we can remove the steering wheel and read a book while the car drives itself. Only 100% is good enough; 99% doesn't cut it. What did we actually solve if I'm still required to 'drive' the car because of the 1% chance things go wrong?

In the case of an LLM, how much time is saved if a person still has to check its output? Can that saved time justify the actual operating cost of the LLM, as opposed to the subsidized cost AI vendors are currently charging?
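The question can be framed as a simple break-even calculation. The sketch below is only an illustration: the function name `net_saving_per_task` and every number in it (minutes, hourly rate, per-task LLM cost) are hypothetical placeholders, not measurements.

```python
def net_saving_per_task(
    write_minutes: float,
    review_minutes: float,
    hourly_rate: float,
    llm_cost: float,
) -> float:
    """Money saved per task when a person reviews LLM output instead of writing it.

    A negative result means the LLM costs more than the time it saves.
    """
    saved_minutes = write_minutes - review_minutes
    return saved_minutes / 60 * hourly_rate - llm_cost


# Hypothetical task: 60 minutes to write by hand, 45 minutes to review properly.
subsidized = net_saving_per_task(60, 45, hourly_rate=100.0, llm_cost=5.0)
unsubsidized = net_saving_per_task(60, 45, hourly_rate=100.0, llm_cost=30.0)

print(subsidized)    # 20.0
print(unsubsidized)  # -5.0
```

With these made-up numbers the saving flips from positive to negative once the price of the LLM rises, because the review time barely shrinks the human effort.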

In the case of writing code, an LLM can probably create more functionality and features in a week than a team of 100 engineers can validate in a year. So no matter how fast the LLMs really are, the people are the bottleneck we can't circumvent.
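The bottleneck argument is plain throughput arithmetic: validated output is capped by the slower stage. The sketch below assumes, conservatively, that the LLM generates in one week exactly what the team can validate in one year; the "unit" of work and the function name `delivered_per_year` are hypothetical.

```python
def delivered_per_year(generated_per_year: float, validated_per_year: float) -> float:
    """Validated output is capped by the slower of the two stages."""
    return min(generated_per_year, validated_per_year)


WEEKS_PER_YEAR = 52

# One "unit" is whatever the team can validate in a year (assumption).
validation_capacity = 1.0
# The LLM is assumed to produce one such unit every week.
generation_capacity = 1.0 * WEEKS_PER_YEAR

delivered = delivered_per_year(generation_capacity, validation_capacity)
unreviewed_backlog = generation_capacity - delivered

print(delivered)           # 1.0
print(unreviewed_backlog)  # 51.0

# Doubling the LLM's speed changes nothing about what gets delivered.
print(delivered_per_year(2 * generation_capacity, validation_capacity))  # 1.0
```

Under these assumptions, 51 of every 52 units generated stay unvalidated, and a faster model only makes that backlog grow.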

That is, if we care about correctness, about quality, about stability and so on. But if we don't, why do something in the first place, regardless of whether an LLM is involved?

So this is why I think it's impossible for the AI hype to deliver on the sky-high promises that the absurd valuations suggest. I'm not saying that AI can't be valuable[2], but it is probably orders of magnitude less valuable than people want us to believe.

I admit this isn't an original idea; people have voiced it in different, maybe more succinct ways. Yet I think it is worth repeating.

P.S.

I also don't understand why organizations would make LLMs an integral part of their processes, only to discover that models get changed and tweaked, resulting in wildly different and unpredictable output.

Sometimes, when the light hits an LLM just right and you squint your eyes, it takes the shape of a cryptocurrency. At least that's what I see.

[1] The number is based on nothing; in practice it's probably worse, and the actual value is not important except to note that it's not 100%.

[2] I'm ignoring the energy 'waste', the pollution, the IP theft, the copyright infringement, the AI-induced self-harm. On and on the list goes.