Technology
Source: VentureBeat

Google Unveils DiffusionGemma for Parallel Text Generation and Self-Correction

Google has released DiffusionGemma, an experimental open-source model that applies the diffusion process, typically used in image generation, to text generation at production scale. Built on the Gemma 4 backbone, DiffusionGemma differs from traditional sequential language models: it generates blocks of 256 tokens in parallel, refining them iteratively and self-correcting along the way. This approach allows for significantly faster text generation, with Google reporting up to 4x speed improvements on GPUs compared to standard models, particularly for local inference and low-concurrency deployments. Google acknowledges, however, that the model's overall output quality is currently lower than that of standard Gemma 4.

By Fainaron · Jun 12, 2026
Google has introduced DiffusionGemma, an experimental open-source model that leverages the diffusion technique for text generation. This approach mirrors how GenAI image generators refine an entire image in parallel from noise, rather than generating content sequentially.

Traditional language models operate like typewriters, generating one token at a time without the ability to revise previous outputs. DiffusionGemma breaks this pattern by generating a 256-token block in parallel. It begins with random placeholder tokens, then refines the entire block through multiple passes, evaluating and locking in confident positions while re-randomizing and reconsidering uncertain ones. This process enables self-correction and allows every token position to consider all others simultaneously, providing bidirectional context.
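
The refinement loop described above can be illustrated with a toy sketch. This is not Google's implementation: `toy_denoiser`, `diffusion_decode`, the 0.5 confidence threshold, and the character-level vocabulary are all hypothetical, and the stand-in "model" cheats by peeking at a known target string. It only shows the control flow of starting from random placeholders, locking confident positions, and re-randomizing the rest.

```python
import random

VOCAB = "abcdefghijklmnopqrstuvwxyz "

def toy_denoiser(block, target, rng):
    """Hypothetical stand-in for the model: one (token, confidence) pair per position.

    A real diffusion model would score each position using the whole block as
    context. This toy ignores the block contents and simply guesses the target
    token half of the time, purely for illustration.
    """
    proposals = []
    for i in range(len(block)):
        if rng.random() < 0.5:
            proposals.append((target[i], 0.9))          # confident guess
        else:
            proposals.append((rng.choice(VOCAB), 0.2))  # uncertain guess
    return proposals

def diffusion_decode(target, max_passes=64, threshold=0.5, seed=0):
    rng = random.Random(seed)
    n = len(target)
    block = [rng.choice(VOCAB) for _ in range(n)]  # start from random placeholders
    locked = [False] * n
    passes = 0
    while not all(locked) and passes < max_passes:
        passes += 1
        # Every position is scored in the same pass, not one token at a time.
        for i, (token, conf) in enumerate(toy_denoiser(block, target, rng)):
            if locked[i]:
                continue                      # already committed, leave untouched
            if conf >= threshold:
                block[i] = token              # lock in a confident position
                locked[i] = True
            else:
                block[i] = rng.choice(VOCAB)  # re-randomize an uncertain position
    return "".join(block), passes

if __name__ == "__main__":
    text, passes = diffusion_decode("hello diffusion")
    print(f"{text!r} after {passes} passes")
```

The number of passes depends on how quickly positions become confident, not on the block length, which is where the parallel speedup in this style of decoding comes from.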

DiffusionGemma is built on the Gemma 4 backbone and released under the Apache 2.0 license. It is also the first diffusion language model natively supported in the open-source vLLM inference platform. Google states that DiffusionGemma can generate text up to four times faster than standard models on GPUs. Benchmarks by vLLM show the FP8 version reaching 1,008 tokens per second on a single Nvidia H100 and 1,288 on an H200 at batch size 1.
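
To put the reported throughput in latency terms, the following back-of-the-envelope sketch converts tokens per second into time per 256-token block. It assumes steady-state generation at the reported rate and ignores prompt processing; the helper name `seconds_per_block` is illustrative and not part of any library.

```python
BLOCK_TOKENS = 256  # block size stated in the article

# Throughput figures reported by vLLM for the FP8 model at batch size 1.
reported_tps = {"H100": 1008, "H200": 1288}

def seconds_per_block(tokens_per_second, block_tokens=BLOCK_TOKENS):
    """Average wall-clock time to produce one block at a given throughput."""
    return block_tokens / tokens_per_second

for gpu, tps in reported_tps.items():
    print(f"{gpu}: {seconds_per_block(tps):.3f} s per {BLOCK_TOKENS}-token block")
# H100: 0.254 s per 256-token block
# H200: 0.199 s per 256-token block
```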

Despite the speed advantages, Google notes that DiffusionGemma's overall output quality is lower than that of standard Gemma 4 and recommends the latter for applications demanding maximum quality. DiffusionGemma is a 26B-parameter Mixture of Experts (MoE) model that activates 3.8B parameters during inference. When quantized, it can fit within 18GB of VRAM on consumer hardware such as the Nvidia RTX 4090 and 5090.
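
A rough sanity check on those memory figures is sketched below. It counts weights only, uses decimal gigabytes, and ignores activations and any cache, so the result is an upper bound on the average bits per parameter rather than a statement about the quantization format actually used, which the article does not specify. `weight_gb` is an illustrative helper.

```python
TOTAL_PARAMS_B = 26.0   # total parameters, in billions (from the article)
ACTIVE_PARAMS_B = 3.8   # parameters active per token, in billions
VRAM_GB = 18.0          # reported memory footprint when quantized

def weight_gb(params_billion, bits_per_param):
    """Weight storage in decimal GB: billions of params * bits / 8 bits per byte."""
    return params_billion * bits_per_param / 8

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_gb(TOTAL_PARAMS_B, bits):.1f} GB")
# 16-bit weights: 52.0 GB
# 8-bit weights: 26.0 GB
# 4-bit weights: 13.0 GB

# Upper bound on average bits per parameter if all 18 GB held weights.
max_bits = VRAM_GB * 8 / TOTAL_PARAMS_B
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"at most {max_bits:.2f} bits per parameter")
print(f"{active_fraction:.1%} of parameters active per token")
```

Under these assumptions, 8-bit weights alone would exceed 18GB, so the quoted footprint implies quantization below roughly 5.5 bits per parameter on average.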

The speed gains are particularly significant for local inference, single-user applications, and low-concurrency serving, where the GPU might otherwise be underutilized. However, for high-throughput cloud serving, where autoregressive models already saturate compute resources, DiffusionGemma's parallel decoding offers diminishing returns. Its bidirectional context and self-correction make it structurally well-suited for constrained generation tasks, such as code infilling, template generation, and problems requiring contextual understanding from the entire sequence.

DiffusionGemma integrates with vLLM via a new ModelState interface, which supports the per-request attention switching that the model's alternating causal and bidirectional attention requires. The interface is also intended to support future diffusion models in vLLM. For enterprises, DiffusionGemma provides an alternative path for reducing generation latency on dedicated GPU hardware, especially for constrained generation workloads where its architecture offers a structural edge.
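
The difference between the two attention modes can be sketched with boolean masks. This example does not use or reflect vLLM's ModelState interface. The function names are hypothetical, and the layout is an assumption: a causally attended prefix followed by one fully bidirectional block.

```python
def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def block_bidirectional_mask(prefix_len, block_len):
    """Causal over the prefix, fully bidirectional inside the trailing block.

    Assumed layout: block positions see the whole prefix and every other
    block position, while prefix positions never see the block.
    """
    n = prefix_len + block_len
    mask = causal_mask(n)
    for i in range(prefix_len, n):
        for j in range(prefix_len, n):
            mask[i][j] = True  # block positions can also look ahead within the block
    return mask

def render(mask):
    """Draw a mask as rows of 'x' (may attend) and '.' (masked out)."""
    return "\n".join("".join("x" if cell else "." for cell in row) for row in mask)

if __name__ == "__main__":
    print(render(causal_mask(5)))
    print()
    print(render(block_bidirectional_mask(prefix_len=2, block_len=3)))
```

In the second mask the last three rows are fully filled, which is the "every position considers all others" property the article attributes to the 256-token block.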

According to VentureBeat, DiffusionGemma represents a different generation paradigm from speculative decoding, which uses a smaller draft model to guess tokens for a standard target model. DiffusionGemma is not merely a decoding trick but a distinct method of text creation.

Source attribution: This article was AI-curated and rewritten by Fainaron from a piece originally published by VentureBeat.
