2026-07-03 16:29 UTC | In-site rewrite | 6 min read | Updated: 2026-07-03 16:37 UTC

The Information Theory Behind Why AI Writing Sucks

This article explains why AI-generated text lacks the unique voice of human authors, using information theory concepts like probability distributions and KL divergence. Alignment techniques like RLHF push models toward a low-variance 'Annotator Consensus Dialect', and prompting or temperature adjustments fail to reproduce human stylistic irregularity.

Source: Hacker News AI | Author: malshe

Table of contents

Voice as a probability distribution

The RLHF trap and the "Annotator Consensus Dialect"

The illusion of camouflage (why prompting for style fails)

The failure of temperature and friends

So what?

Disclosure: An AI language model was used during the editing process to draft technical descriptions and suggest structural and prose improvements. Several suggestions from AI were used in the final version of the article.

I have read an embarrassingly large amount of fiction, especially science fiction. I also use every flagship AI model that is released for my software engineering job.

Those two sets of experiences left me with a gnawing feeling that AI has a shockingly uniform "voice" when compared to a high-functioning human author.

Anyone with a love for literature has felt what I'm talking about. I've read stories by about five thousand different authors, but I honestly think that even if you've only read a half-dozen authors you'll notice that each author occupies their own stylistic space.

Compared to the unique voices of human writers, AI writing sounds remarkably uniform. It turns out that there's a good reason for this, and it has to do with information theory.

Voice as a probability distribution

A unique authorial "voice" is not random, and it is not average. It is a specific probability distribution — let's call it P_author. When an author writes, they sample from a highly idiosyncratic process. They have specific conditional probabilities for how they implement concepts, pacing, vocabulary, and other stylistic tools.

What makes a voice recognizable are the low-frequency, high-impact choices that an author makes consistently (the long tail of the distribution). If I say "Ted Chiang", you'll immediately think about how syntactically plain but semantically dense his sentences are (it's a style I admire, but as this parenthetical demonstrates, I cannot emulate). If I say "Ursula K. Le Guin", you'll think about how she can be so clear and grounded but still give a lyrical feel — I can't really describe her style well, but readers of Le Guin know what I mean.

Ultimately what I'm getting at is that the right way to measure how "AI-like" a text sounds is not to check whether it's predictable in general (most competent writing is somewhat predictable) but to measure the KL divergence from a specific author's distribution to the model's output distribution: D_KL(P_author || Q_model). For those unfamiliar with KL divergence, this measures how badly the model's distribution fails to cover the author's choices. To be specific, it is the expected extra information cost of encoding samples from P_author using a code optimized for Q_model. The measure is asymmetric: it grows fastest where the author often does something that the model treats as rare. When this divergence is large and structured, you hear a voice.
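To make the formula concrete, here is a minimal sketch of D_KL for two small discrete distributions. The three "sentence ending" categories and their probabilities are invented for illustration and are not measurements of any real author or model. The function name `kl_divergence` is my own.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in bits for two discrete distributions given as dicts."""
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0.0:
            continue  # outcomes P never produces contribute nothing
        q_x = q.get(outcome, 0.0)
        if q_x == 0.0:
            return math.inf  # Q gives zero mass to something P actually does
        total += p_x * math.log2(p_x / q_x)
    return total

# Hypothetical probabilities for how a sentence ends (assumption: toy numbers).
p_author = {"period": 0.70, "fragment": 0.20, "parenthetical": 0.10}
q_model = {"period": 0.96, "fragment": 0.03, "parenthetical": 0.01}

author_vs_model = kl_divergence(p_author, q_model)  # cost of covering the author with the model
model_vs_author = kl_divergence(q_model, p_author)  # the reverse direction

print(f"D_KL(P_author || Q_model) = {author_vs_model:.3f} bits")
print(f"D_KL(Q_model || P_author) = {model_vs_author:.3f} bits")
```

With these toy numbers the author-to-model direction comes out larger than the reverse, because the author's habitual fragments and parentheticals sit where the model puts very little probability.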

The RLHF trap and the "Annotator Consensus Dialect"

During pre-training, a large language model learns a map of the general distribution of human text. This base distribution, Q_base, is enormously wide. Its latent space contains the capacity to approximate almost any P_author.

The trap begins with alignment. To make the model safe and useful, labs apply techniques such as Reinforcement Learning from Human Feedback (RLHF). The details vary from method to method, but the bottom line is that the model is optimized to produce outputs that score well against a reward signal derived from human (or AI) preferences.

This does not push the model toward the statistical average of English. It pushes it toward something with a different probability distribution — let's call this the Annotator Consensus Dialect.

The mechanism is this: when the judges (gig workers, domain experts, or whoever is hired to evaluate outputs) rate responses, idiosyncratic writing creates high variance in ratings. My style of writing might score 5/5 from one rater and 2/5 from another, which averages out to 3.5. A sterile, symmetrical, heavily hedged response might score 4/5 across the board. The optimizer maximizes expected reward, so its safest strategy is to collapse variance and produce the response that nobody dislikes. The result is the conversational equivalent of hotel lobby decor.
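The arithmetic behind that example can be sketched in a few lines. The ratings below are hypothetical: they extend the 5/5-versus-2/5 example to four imaginary annotators and are not drawn from any real preference dataset.

```python
from statistics import mean, pvariance

# Hypothetical ratings from four annotators (assumption: illustrative only).
ratings = {
    "idiosyncratic": [5, 2, 5, 2],  # loved by some raters, disliked by others
    "consensus": [4, 4, 4, 4],      # mildly liked by everyone
}

summary = {
    style: {"expected_reward": mean(scores), "variance": pvariance(scores)}
    for style, scores in ratings.items()
}

# A policy trained to maximize expected reward favors the higher mean,
# which here is also the zero-variance option.
preferred = max(summary, key=lambda style: summary[style]["expected_reward"])

for style, stats in summary.items():
    print(style, stats)
print("preferred by the optimizer:", preferred)
```

The point of the sketch is only that a mean-maximizing objective never sees the two 5/5 ratings as a reason to keep the divisive style. Real reward models are learned from pairwise comparisons and are far more complicated than a table of averages.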

You might say "Joe, this isn't a fair characterization! Newer alignment techniques are explicitly designed to preserve diversity!" That is true, but the newer methods still optimize for a notion of "preferred" output, which still penalizes high-variance risk-taking relative to safe, broadly acceptable prose.

This is a testable claim (I haven't tested it, but it's testable). If you measured the KL divergence between aligned model outputs and each of two corpora, say corporate communications and literary fiction, my prediction is that the model's distribution would sit far closer to the corporate one. To my knowledge, no one has published this exact measurement, but the optimization math strongly predicts it.
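For anyone who wants to try the measurement, the mechanics might look roughly like the sketch below. It is a crude stand-in: it compares smoothed word-frequency distributions rather than full model distributions, and the three "corpora" are single invented sentences, so the numbers it prints say nothing about real models. The helper names and the add-one smoothing are my assumptions, not an established protocol.

```python
import math
from collections import Counter

def smoothed_unigrams(text, vocab, alpha=1.0):
    """Add-alpha smoothed word probabilities over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts[word] for word in vocab) + alpha * len(vocab)
    return {word: (counts[word] + alpha) / total for word in vocab}

def kl_divergence(p, q):
    """D_KL(P || Q) in nats; both dicts share the same keys and contain no zeros."""
    return sum(p[word] * math.log(p[word] / q[word]) for word in p)

# Invented one-sentence stand-ins for real corpora (assumption: toy data only).
corporate = "we are pleased to share an important update with our valued team"
fiction = "the sea broke grey and slow against the black stones of the harbor"
model_output = "we are pleased to share an important overview with our team"

vocab = sorted(set(f"{corporate} {fiction} {model_output}".lower().split()))
q_model = smoothed_unigrams(model_output, vocab)

d_corporate = kl_divergence(smoothed_unigrams(corporate, vocab), q_model)
d_fiction = kl_divergence(smoothed_unigrams(fiction, vocab), q_model)

print(f"D_KL(P_corporate || Q_model) = {d_corporate:.3f} nats")
print(f"D_KL(P_fiction   || Q_model) = {d_fiction:.3f} nats")
```

A real test would need large corpora, many sampled model outputs, and features richer than word counts (sentence rhythm, syntax, discourse structure). The smoothing is there because an unsmoothed estimate goes to infinity as soon as one corpus uses a word the other never does.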

The illusion of camouflage (why prompting for style fails)

I know what you're thinking: yeah, but you can prompt the model out of this dialect. "Write in the style of a 1920s hard-boiled detective" or whatever (part of me wants to see what this article would read like if I asked a model to rewrite it as Lupe Fiasco lyrics). This does produce text that looks different from the Annotator Consensus Dialect, but it still feels suspiciously uniform.

This is because there is a mathematical difference between shifting a distribution's mean and reproducing its variance structure.

When you ask a model to mimic an author, it shifts its center of mass. In effect, it finds the statistical average of the target's vocabulary, sentence structure, and other stylistic habits, and moves there. But it applies the same variance-collapsed mechanics we've been discussing to this new location.

Human style relies on structured irregularity. An author has a baseline rhythm, but they break it intentionally by doing things like stumbling into a fragment, dropping an uncharacteristic verb, or tangling a sentence for emotional effect. Computational stylometry has tools for measuring this. For example, the Hurst exponent of a sentence-length time series can reveal long-range dependencies in human writing that AI text lacks. Human authors also modulate their lexical diversity in ways that models don't.
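As a rough illustration of the Hurst-exponent idea, the sketch below estimates it with classical rescaled-range (R/S) analysis. This is one basic estimator among several and is known to be biased on short series. The sentence splitter is deliberately naive, and all function names are my own. Because I don't want to invent text samples, the demo runs on synthetic series: uncorrelated noise versus a random walk, which has strong long-range structure.

```python
import math
import random
import re
from statistics import mean

def sentence_lengths(text):
    """Word counts per sentence, using a naive split on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def rescaled_range(chunk):
    """R/S statistic: range of cumulative deviations over the standard deviation."""
    center = mean(chunk)
    running, cumulative = 0.0, []
    for value in chunk:
        running += value - center
        cumulative.append(running)
    std = math.sqrt(mean([(value - center) ** 2 for value in chunk]))
    if std == 0:
        return None  # a constant chunk carries no information
    return (max(cumulative) - min(cumulative)) / std

def hurst_exponent(series, min_window=8):
    """Slope of log(R/S) against log(window size), fitted by least squares."""
    points = []
    window = min_window
    while window <= len(series) // 2:
        chunks = [series[i:i + window]
                  for i in range(0, len(series) - window + 1, window)]
        values = [rs for rs in map(rescaled_range, chunks) if rs is not None]
        if values:
            points.append((math.log(window), math.log(mean(values))))
        window *= 2
    if len(points) < 2:
        raise ValueError("series too short for an R/S estimate")
    x_bar = mean([x for x, _ in points])
    y_bar = mean([y for _, y in points])
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in points)
    denominator = sum((x - x_bar) ** 2 for x, _ in points)
    return numerator / denominator

# Synthetic demo data (assumption: stands in for real sentence-length series).
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(1024)]
walk, position = [], 0.0
for step in noise:
    position += step
    walk.append(position)

h_noise = hurst_exponent(noise)  # no long-range structure
h_walk = hurst_exponent(walk)    # strong long-range structure
print(f"noise: H = {h_noise:.2f}, random walk: H = {h_walk:.2f}")
```

To apply it to prose you would pass `sentence_lengths(text)` to `hurst_exponent`, using a text with at least a few hundred sentences. Shorter samples give estimates too noisy to compare.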

All this is to say that when you ask for writing in a particular style, the model captures the tropes of the target style but smooths out all the burstiness. It generates a caricature of what you asked for.

The failure of temperature and friends

If the AI's distribution is too narrow, why can't we just widen it?

The most common approach is temperature scaling. The model's raw logits are divided by the temperature T before probabilities are computed, so raising T above 1 flattens the entire distribution and makes less likely words more likely to be picked. But it does this blindly. A human author's eccentricity is highly conditional. Humans break the rules in very specific, consistent ways, whereas temperature scaling just introduces stochastic noise.
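A minimal sketch of temperature scaling shows why it is blind. The four logits are hypothetical values for four candidate next tokens, and the helper names are mine.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; higher means a flatter distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical logits for four candidate next tokens (assumption: toy values).
logits = [4.0, 2.0, 1.0, 0.5]

cool = softmax_with_temperature(logits, temperature=0.5)
default = softmax_with_temperature(logits, temperature=1.0)
hot = softmax_with_temperature(logits, temperature=2.0)

for name, dist in (("T=0.5", cool), ("T=1.0", default), ("T=2.0", hot)):
    rounded = [round(p, 3) for p in dist]
    print(name, rounded, f"entropy={entropy_bits(dist):.2f} bits")
```

Note what does not change: the ranking of the tokens is identical at every temperature, and the same rescaling is applied at every position regardless of context. Temperature changes how often the unlikely options are taken. It does not change which options are unlikely or when taking them would be in character.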

Hopefully this is intuitive: increasing temperature just moves you from "suspiciously smooth" to "suspiciously random" without passing through human at all.

I know there are more sophisticated decoding strategies. Top-p (nucleus) sampling, top-k filtering, repetition penalties, and classifier-free guidance all attempt more targeted redistribution. They do help at the margins, but none of them solves the fundamental problem: these are inference-time interventions operating on a model whose whole operating philosophy (if you can call it that) was shaped during alignment.
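For reference, here is a simplified sketch of what top-k and top-p filtering do to a next-token distribution. Production implementations operate on logits over a full vocabulary and differ in details such as tie handling. The probabilities and function names here are assumptions for illustration.

```python
def renormalize(probs, keep):
    """Zero out every index not in `keep` and rescale the rest to sum to 1."""
    keep = set(keep)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= p:
            break
    return renormalize(probs, kept)

# Hypothetical next-token probabilities (assumption: toy values).
probs = [0.50, 0.30, 0.15, 0.05]

after_top_k = top_k_filter(probs, k=2)
after_top_p = top_p_filter(probs, p=0.9)

print("top-k (k=2):  ", [round(x, 3) for x in after_top_k])
print("top-p (p=0.9):", [round(x, 3) for x in after_top_p])
```

Both filters only truncate and rescale whatever distribution the aligned model already produced. They can trim the tail, but they cannot put back probability mass that alignment moved away from an author's characteristic choices.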

There is also an important nuance here that one of my friends recently pointed out to me: alignment does not erase the base model's latent capacity for stylistic variance. The pre-trained weights still encode most of the richness of Q_base, provided the model is large enough. There are emerging inference-time steering techniques like Representation Engineering that can partially recover the suppressed variance by reaching into the underlying latent space. These are still research areas, though, and not something available in public AI products.

Similarly, long-context in-context learning can provide slightly better results, but attention attenuates when the context gets big enough, and the output starts to drift back toward the uniform default voice as the context grows.

So what?

The main takeaway here is that the design choices that go into RLHF-adjacent techniques are going to keep these AI "voices" detectable far longer than anyone wants to admit.

Also, it's useful to think of an author's style as a specific high-dimensional probability distribution, and I'd challenge you to try to spot where that divergence shows up the next time you're reading your favorite author. Where does the author's voice come from? It's a fun exercise that might increase your enjoyment of the text, and the difficult process of practicing and internalizing new knowledge is a good one to perform in these days of LLM-induced skill atrophy.

Joe Stech, Guest Writer

Joe Stech is the editor of the yearly anthology series Think Weirder: The Year's Best Science Fiction Ideas. He also works as a Principal Solutions Architect on developer and platform enablement at Arm. Views expressed here are his own.