DiffusionGemma

Sebastien Dubois

20 Jun 2026

DiffusionGemma is an experimental open model from Google DeepMind in the Gemma family that generates text by diffusion instead of autoregression. Built on Gemma 4 and Gemini Diffusion research, it denoises whole spans in parallel rather than predicting one token at a time. The payoff is speed: up to 4× faster output, exceeding 1,000 tokens/second on a single NVIDIA H100.

How diffusion text generation differs

Standard Large Language Models (LLMs) are autoregressive: they generate strictly sequentially, token by token, and are bottlenecked on memory bandwidth. DiffusionGemma uses discrete text diffusion with bidirectional attention to generate many tokens per forward pass (a "canvas" of 256), then iteratively refines them. This enables self-correction and better global consistency, and it shifts the bottleneck from memory bandwidth to raw compute.
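
To make the contrast concrete, here is a minimal toy sketch of parallel, confidence-ordered unmasking next to a token-by-token loop. It is not DiffusionGemma's actual algorithm: `toy_denoiser`, `diffusion_decode`, `autoregressive_decode`, and the confidence rule are hypothetical stand-ins, the "model" simply reads from a known target, and the self-correction step (revising tokens that were already committed) is left out.

```python
import random

MASK = "<mask>"


def toy_denoiser(canvas, target, rng):
    """Hypothetical stand-in for one model forward pass.

    Returns a (token, confidence) proposal for every canvas position at
    once. A real model would compute these with bidirectional attention;
    this toy version just reads the answer from `target`.
    """
    proposals = []
    for i in range(len(canvas)):
        # Toy rule: confidence grows with the number of filled neighbours.
        neighbours = [canvas[j] for j in (i - 1, i + 1) if 0 <= j < len(canvas)]
        filled = sum(tok != MASK for tok in neighbours)
        confidence = 0.3 + 0.3 * filled + 0.1 * rng.random()
        proposals.append((target[i], confidence))
    return proposals


def diffusion_decode(target, tokens_per_pass=4, seed=0):
    """Fill a fully masked canvas, committing several tokens per pass."""
    rng = random.Random(seed)
    canvas = [MASK] * len(target)
    passes = 0
    while MASK in canvas:
        proposals = toy_denoiser(canvas, target, rng)
        masked = [i for i, tok in enumerate(canvas) if tok == MASK]
        # Commit the most confident masked positions, wherever they are.
        masked.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in masked[:tokens_per_pass]:
            canvas[i] = proposals[i][0]
        passes += 1
    return canvas, passes


def autoregressive_decode(target):
    """Baseline: one forward pass per token, strictly left to right."""
    out = []
    for tok in target:
        out.append(tok)
    return out, len(out)


if __name__ == "__main__":
    target = [f"t{i}" for i in range(16)]
    _, diffusion_passes = diffusion_decode(target, tokens_per_pass=4)
    _, ar_passes = autoregressive_decode(target)
    print(f"diffusion passes: {diffusion_passes}, autoregressive passes: {ar_passes}")
```

With 16 tokens and 4 commits per pass, the toy loop finishes in 4 passes where the sequential loop needs 16. Each parallel pass does more work, though, which is the sense in which the bottleneck moves from memory bandwidth to compute.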

Architecture

~26B total parameters (25.2B), 3.8B active: Mixture of Experts (MoE) with 8 of 128 experts active plus 1 shared expert
Encoder-decoder: an autoregressive encoder caches the prompt context, paired with a diffusion decoder
Context window of up to 256K tokens; 262K-token vocabulary; sliding window of 1024
15–20 tokens generated per forward pass (>1100 tok/s on H100 in FP8)
Multimodal input (text, image, video → text); ~550M vision params; built-in thinking mode
Supports NVIDIA NVFP4 (4-bit float) on Blackwell GPUs
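
The figures in this list can be related with some back-of-the-envelope arithmetic. The sketch below uses only the numbers quoted above. The helper names are hypothetical, and two readings are assumptions rather than stated facts: that the 15–20 tokens per pass apply at the quoted H100 throughput, and that each pass finalizes that many tokens of the 256-token canvas.

```python
import math

# Figures quoted in the list above.
TOTAL_PARAMS_B = 25.2
ACTIVE_PARAMS_B = 3.8
EXPERTS_TOTAL = 128
EXPERTS_ROUTED = 8
CANVAS_TOKENS = 256
TOKENS_PER_PASS = (15, 20)
THROUGHPUT_TOK_S = 1100  # quoted as a lower bound (">1100 tok/s")


def active_fraction(active_b, total_b):
    """Share of parameters used for any single token."""
    return active_b / total_b


def passes_per_second(throughput_tok_s, tokens_per_pass):
    """Forward passes per second implied by a throughput figure."""
    return throughput_tok_s / tokens_per_pass


def passes_to_fill(canvas_tokens, tokens_per_pass):
    """Passes needed to finalize a full canvas (assumed interpretation)."""
    return math.ceil(canvas_tokens / tokens_per_pass)


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"active parameters: {frac:.1%} of total")
    print(f"routed experts per token: {EXPERTS_ROUTED / EXPERTS_TOTAL:.2%}")
    for tpp in TOKENS_PER_PASS:
        rate = passes_per_second(THROUGHPUT_TOK_S, tpp)
        fill = passes_to_fill(CANVAS_TOKENS, tpp)
        print(f"{tpp} tok/pass: at least {rate:.0f} passes/s, {fill} passes per canvas")
```

Under these assumptions, roughly 15% of the parameters are active per token, the quoted throughput corresponds to at least 55 to 73 forward passes per second, and a full canvas takes about 13 to 18 passes.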

Performance (instruction-tuned)

MMLU Pro: 77.6% · GPQA Diamond: 73.2% · LiveCodeBench v6: 69.1% · MATH-Vision: 70.5%

Availability

Open-weight release under the Apache 2.0 license
On Hugging Face (google/diffusiongemma-26B-A4B-it), Kaggle, and Google Vertex AI Model Garden
