QUICK REVIEW

[Paper Review] The Diffusion Duality, Chapter II: $Ψ$-Samplers and Efficient Curriculum

Justin Deschenaux, Caglar Gulcehre|arXiv (Cornell University)|Feb 24, 2026

Generative Adversarial Networks and Image Synthesis | Cited by 0

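One-Sentence Summary

Introduces Ψ-posteriors and predictor-corrector samplers for discrete diffusion that apply to arbitrary priors and improve sample quality, and separately develops a memory-efficient curriculum for Gaussian relaxation training.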

ABSTRACT

Uniform-state discrete diffusion models excel at few-step generation and guidance due to their ability to self-correct, making them preferred over autoregressive or Masked diffusion models in these settings. However, their sampling quality plateaus with ancestral samplers as the number of steps increases. We introduce a family of Predictor-Corrector (PC) samplers for discrete diffusion that generalize prior methods and apply to arbitrary noise processes. When paired with uniform-state diffusion, our samplers outperform ancestral sampling on both language and image modeling, achieving lower generative perplexity at matched unigram entropy on OpenWebText and better FID/IS scores on CIFAR10. Crucially, unlike conventional samplers, our PC methods continue to improve with more sampling steps. Taken together, these findings call into question the assumption that Masked diffusion is the inevitable future of diffusion-based language modeling. Beyond sampling, we develop a memory-efficient curriculum for the Gaussian relaxation training phase, reducing training time by 25% and memory by 33% compared to Duo while maintaining comparable perplexity on OpenWebText and LM1B and strong downstream performance. We release code, checkpoints, and a video-tutorial on: https://s-sahoo.com/duo-ch2

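Motivation and Objectives

Develop better sampling methods for discrete diffusion, particularly for uniform-state diffusion models (USDMs), whose sample quality plateaus under ancestral sampling.
Define a family of non-Markovian posteriors (Ψ-posteriors) that share the marginals of the forward diffusion process while allowing remasking and correction.
Propose Ψ-samplers that combine predictor and corrector steps so that sample quality keeps improving as the number of sampling steps grows.
Develop a memory-efficient curriculum for Gaussian relaxation training that speeds up training without sacrificing perplexity or downstream performance.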

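Proposed Method

Define the Ψ-posterior as a linear combination of the forward process and the reverse posterior, yielding a sampler that is non-Markovian yet consistent with the forward marginals.
Extend predictor-corrector (PC) sampling to arbitrary priors, covering both masked diffusion models (MDMs) and USDMs, and show how the resulting samplers generalize existing PC methods.
Formalize the NELBO for the Ψ-posterior and relate it to Gaussian latents through the diffusion transformation operator.
Introduce a memory-efficient curriculum that reduces training time and peak memory while preserving perplexity and downstream accuracy.
Describe guidance for diffusion models, including adaptations of classifier guidance and classifier-free guidance to discrete data.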
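The first item above, a mixture of the forward process and the reverse posterior, can be checked numerically on a toy example. The sketch below assumes a uniform-state forward process over K categories and a scalar mixing weight; the names `kappa`, `alpha_s`, and `alpha_t` are hypothetical, and the paper's exact parameterization may differ. It only illustrates why such a mixture leaves the forward marginal at the earlier time unchanged.

```python
# Illustrative sketch only. Names such as `kappa`, `alpha_s`, and `alpha_t`
# are hypothetical and are not taken from the paper or its released code.

def forward_marginal(x, alpha, K):
    """q(z | x) = alpha * onehot(x) + (1 - alpha) / K, as a list of K probabilities."""
    return [alpha * (1.0 if z == x else 0.0) + (1.0 - alpha) / K for z in range(K)]

def reverse_posterior(z_t, x, alpha_s, alpha_t, K):
    """q(z_s | z_t, x) by Bayes' rule: proportional to q(z_t | z_s) * q(z_s | x)."""
    a_ts = alpha_t / alpha_s  # signal retained when going from time s to the noisier time t
    prior = forward_marginal(x, alpha_s, K)
    unnorm = [
        (a_ts * (1.0 if z_s == z_t else 0.0) + (1.0 - a_ts) / K) * prior[z_s]
        for z_s in range(K)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def psi_posterior(z_t, x, alpha_s, alpha_t, kappa, K):
    """Assumed form: kappa * reverse posterior + (1 - kappa) * forward marginal at time s."""
    post = reverse_posterior(z_t, x, alpha_s, alpha_t, K)
    fwd = forward_marginal(x, alpha_s, K)
    return [kappa * p + (1.0 - kappa) * f for p, f in zip(post, fwd)]

def marginal_at_s(x, alpha_s, alpha_t, kappa, K):
    """Average the mixture over z_t ~ q(z_t | x) to get the induced marginal at time s."""
    q_t = forward_marginal(x, alpha_t, K)
    out = [0.0] * K
    for z_t in range(K):
        psi = psi_posterior(z_t, x, alpha_s, alpha_t, kappa, K)
        for z_s in range(K):
            out[z_s] += q_t[z_t] * psi[z_s]
    return out

if __name__ == "__main__":
    K, x = 4, 2
    want = forward_marginal(x, 0.8, K)
    for kappa in (0.0, 0.5, 1.0):
        got = marginal_at_s(x, 0.8, 0.5, kappa, K)
        # Largest deviation from the forward marginal; should be at floating-point noise level.
        print(kappa, max(abs(g - w) for g, w in zip(got, want)))
```

In this toy setup, `kappa = 1.0` reduces to the ordinary reverse posterior, while smaller values resample part of the mass from the forward marginal regardless of the current state, which is what lets a token be revised. In an actual sampler the clean token `x` would presumably be replaced by the denoiser's prediction; that step is not modeled here.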


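Experimental Results

Research Questions

RQ1: Do Ψ-samplers improve generation quality over conventional ancestral sampling and remasking samplers on discrete diffusion tasks with arbitrary priors?
RQ2: Can the non-Markovian Ψ-posterior preserve the same marginals as the forward diffusion while enabling remasking and improved inference?
RQ3: Does the proposed memory-efficient curriculum accelerate Gaussian relaxation training without harming perplexity or downstream performance?
RQ4: How do Ψ-samplers compare with existing methods on language and image modeling benchmarks?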

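Key Findings

Ψ-samplers generalize prior methods to arbitrary noise distributions, and their generation quality keeps improving as the number of sampling steps increases.
The Ψ-posterior enables predictor-corrector steps that correct errors while remaining consistent with the forward marginals.
For USDMs, Ψ-samplers outperform ancestral sampling on language and image tasks (lower generative perplexity at matched unigram entropy on OpenWebText, better FID/IS on CIFAR10) and narrow the gap to masked diffusion models at high NFE.
The memory-efficient curriculum cuts training time by 25% and memory by 33% relative to Duo while keeping perplexity comparable on OpenWebText and LM1B and downstream performance strong.
Guidance mechanisms such as classifier-free guidance (CFG) can be applied within the Ψ-sampler framework to steer generation in discrete diffusion.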
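To make the guidance item concrete, the sketch below shows one common way classifier-free guidance is written for categorical predictions: mix conditional and unconditional log-probabilities with a guidance weight, then renormalize. This summary does not state the exact form used in the paper, so the function names and the weight `gamma` are hypothetical and the formula should be read as an assumption.

```python
import math

# Illustrative sketch only: `cfg_log_probs` and `gamma` are hypothetical names,
# and this mixing rule is an assumed, commonly used form of classifier-free guidance.

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]

def cfg_log_probs(cond_logits, uncond_logits, gamma):
    """Return normalized log-probs proportional to gamma*log p(x|c) + (1-gamma)*log p(x)."""
    lc = log_softmax(cond_logits)
    lu = log_softmax(uncond_logits)
    mixed = [gamma * c + (1.0 - gamma) * u for c, u in zip(lc, lu)]
    return log_softmax(mixed)

if __name__ == "__main__":
    cond = [2.0, 0.0, 0.0]    # conditional model favors category 0
    uncond = [0.0, 0.0, 0.0]  # unconditional model is uniform
    for gamma in (0.0, 1.0, 2.0):
        probs = [math.exp(v) for v in cfg_log_probs(cond, uncond, gamma)]
        print(gamma, probs)
```

With `gamma = 1.0` the result is the conditional distribution, with `gamma = 0.0` it is the unconditional one, and values above 1 push more mass toward categories the condition favors.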


This review was generated by AI and reviewed by a human editor.