2026-06-17 · On-site rewrite · 1 min read · Updated: 2026-06-17

The Sequence AI of the Week #878: Inside Google DeepMind's First Real Crack in Next-Token Generation

Google DeepMind has released DiffusionGemma, a text-diffusion model that challenges traditional transformer architectures by not generating text left-to-right token by token.

Source: TheSequence · Author: Jesus Rodriguez

As we wrap up our series on alternatives to transformer architectures, Google DeepMind has just released one of the most impressive models in this category. DiffusionGemma is a text-diffusion model that challenges conventional transformer models. Today, we take a deep dive into the specifics of this model.

Most language models write like a typewriter. They place one token after another, left to right, never revisiting the characters already stamped onto the page. This architecture has carried the entire modern LLM era: GPT-style chatbots, coding copilots, reasoning models, agent frameworks, enterprise assistants. The model predicts the next token, appends it, updates its state, and repeats.
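To make the typewriter loop concrete, here is a minimal Python sketch of predict, append, repeat. The `NEXT` lookup table and the `predict_next` and `generate` helpers are hypothetical stand-ins invented for illustration, not any real model's API; an actual LLM would condition on the whole prefix and sample from a probability distribution over its vocabulary.

```python
# Hypothetical lookup table standing in for a trained model (illustration only).
NEXT = {
    "<bos>": "the",
    "the": "cat",
    "cat": "sat",
    "sat": "<eos>",
}


def predict_next(tokens):
    """Toy 'model': looks only at the last token.

    A real language model would use the entire prefix in `tokens`.
    """
    return NEXT.get(tokens[-1], "<eos>")


def generate(max_new_tokens=10):
    """Left-to-right decoding: predict one token, append it, repeat."""
    tokens = ["<bos>"]
    for _ in range(max_new_tokens):
        nxt = predict_next(tokens)  # predict the next token from the prefix
        if nxt == "<eos>":          # stop when the model signals the end
            break
        tokens.append(nxt)          # append; earlier tokens are never revised
    return tokens[1:]               # drop the start-of-sequence marker


print(" ".join(generate()))  # prints "the cat sat"
```

Note that the loop only ever grows the sequence at its right edge: once a token is appended, nothing in the procedure goes back to change it.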

Google’s new DiffusionGemma asks a deceptively simple question: what if text generation did not have to work that way?

Let’s dive in.