QUICK REVIEW

[Paper Review] gDDIM: Generalized denoising diffusion implicit models

Qinsheng Zhang, Molei Tao|arXiv (Cornell University)|Jun 11, 2022

Advanced Neuroimaging Techniques and Applications | Cited by 26

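One-sentence summary

The paper generalizes DDIM to non-isotropic diffusion models (gDDIM) by proposing a principled score parameterization and sampling scheme that substantially speeds up diffusion-based generation; on CLD and BDM it reports large speedups with competitive FID.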

ABSTRACT

Our goal is to extend the denoising diffusion implicit model (DDIM) to general diffusion models (DMs) besides isotropic diffusions. Instead of constructing a non-Markov noising process as in the original DDIM, we examine the mechanism of DDIM from a numerical perspective. We discover that the DDIM can be obtained by using some specific approximations of the score when solving the corresponding stochastic differential equation. We present an interpretation of the accelerating effects of DDIM that also explains the advantages of a deterministic sampling scheme over the stochastic one for fast sampling. Building on this insight, we extend DDIM to general DMs, coined generalized DDIM (gDDIM), with a small but delicate modification in parameterizing the score network. We validate gDDIM in two non-isotropic DMs: Blurring diffusion model (BDM) and Critically-damped Langevin diffusion model (CLD). We observe more than 20 times acceleration in BDM. In the CLD, a diffusion model by augmenting the diffusion process with velocity, our algorithm achieves an FID score of 2.26, on CIFAR10, with only 50 number of score function evaluations (NFEs) and an FID score of 2.86 with only 27 NFEs. Code is available at https://github.com/qsh-zh/gDDIM

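Motivation and objectives

Enable faster sampling for diffusion models beyond isotropic diffusions.
Explain the mechanism of DDIM, to justify ODE-based sampling over SDE-based sampling at low NFEs.
Introduce gDDIM, which reparameterizes the score network, for general diffusion models.
Validate gDDIM on non-isotropic diffusion models and quantify the speedup and sample quality.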

Proposed method

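Reinterpret DDIM through the probability flow ODE and the behavior of the score, to explain where its acceleration comes from.
Generalize DDIM to arbitrary diffusion models by choosing the time-varying matrix K_t equal to R_t, where R_t satisfies a governing equation.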
Parameterize the score network as s_theta(u,t) = -R_t^{-T} epsilon_theta(u,t), and derive the approximations used by deterministic and stochastic gDDIM.
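Develop a multistep predictor-corrector scheme that reduces NFEs while preserving accuracy.
Present the deterministic and stochastic gDDIM formulations together with the corresponding theoretical propositions.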
Empirically compare gDDIM with EM-based (Euler-Maruyama) samplers and probability flow samplers on CLD and BDM.
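To make the reinterpretation above concrete, here is a minimal NumPy sketch of the isotropic special case only, not the paper's implementation. It assumes a variance-preserving forward process x_t = sqrt(alpha_t) x0 + sqrt(1 - alpha_t) eps, so that R_t reduces to sqrt(1 - alpha_t) I, and it assumes the data distribution is a single point so the exact eps is available in closed form. The names `eps_point_mass`, `score_from_eps`, and `ddim_step` are illustrative. In this setting eps stays constant along the deterministic path, which is why one large DDIM step recovers x0.

```python
import numpy as np

def eps_point_mass(x_t, x0, alpha_t):
    """Exact noise when all data mass sits at x0 (illustrative assumption)."""
    return (x_t - np.sqrt(alpha_t) * x0) / np.sqrt(1.0 - alpha_t)

def score_from_eps(eps, alpha_t):
    """Isotropic score parameterization: R_t = sqrt(1 - alpha_t) * I."""
    return -eps / np.sqrt(1.0 - alpha_t)

def ddim_step(x_t, eps, alpha_t, alpha_s):
    """Deterministic DDIM update from noise level alpha_t to alpha_s."""
    scale = np.sqrt(alpha_s / alpha_t)
    return scale * x_t + (np.sqrt(1.0 - alpha_s) - scale * np.sqrt(1.0 - alpha_t)) * eps

rng = np.random.default_rng(0)
x0 = np.array([1.5, -0.5])           # the single data point
alpha_t = 0.01                       # almost pure noise
x_t = np.sqrt(alpha_t) * x0 + np.sqrt(1.0 - alpha_t) * rng.normal(size=2)

eps = eps_point_mass(x_t, x0, alpha_t)

# An intermediate step: eps evaluated at the new state is unchanged.
x_mid = ddim_step(x_t, eps, alpha_t, alpha_s=0.5)
eps_mid = eps_point_mass(x_mid, x0, 0.5)

# A single step all the way to alpha_s = 1 (no noise) lands on x0.
x_rec = ddim_step(x_t, eps, alpha_t, alpha_s=1.0)
print(np.allclose(eps_mid, eps), np.allclose(x_rec, x0))
```

With a learned epsilon_theta the noise estimate is only approximately constant, so more than one step is needed in practice; the sketch only illustrates why holding eps fixed over a step is a good approximation.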

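Experimental results

Research questions

RQ1 Can DDIM-style sampling be exact (or nearly exact) for general diffusion models when suitable approximations are used?
RQ2 How can DDIM be generalized to non-isotropic or augmented diffusion processes while preserving sampling efficiency and quality?
RQ3 Do the score network parameterization and the specific choice of K_t/R_t yield significant acceleration across different DMs?
RQ4 What are the empirical gains in FID versus NFE for CLD and BDM when using gDDIM instead of existing samplers?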

Key findings

FID by number of function evaluations (NFE); lower is better.

Diffusion model	Sampler	NFE=10	NFE=20	NFE=50	NFE=100	NFE=1000
DDPM†	EM	>100	>100	31.2	12.2	2.64
DDPM†	Prob.Flow, RK45	>100	52.5	6.62	2.63	2.56
DDPM†	2nd Heun††	66.25	6.62	2.65	2.57	2.56
DDPM†	gDDIM	4.17	3.03	2.59	2.56	2.56
BDM	Ancestral sampling	>100	>100	29.8	9.73	2.51
BDM	Prob.Flow, RK45	>100	68.2	7.12	2.58	2.46
BDM	gDDIM	4.52	2.97	2.49	2.47	2.46
CLD	EM	>100	>100	57.72	13.21	2.39
CLD	Prob.Flow, RK45	>100	>100	31.7	4.56	2.25
CLD	gDDIM	13.41	3.39	2.26	2.26	2.25
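One way to read a speedup off this table is to ask at which listed NFE gDDIM first comes within a small tolerance of the probability flow sampler's FID at 1000 NFEs. The short script below does that arithmetic on the gDDIM and Prob.Flow rows copied from the table. The tolerance `tol=0.05` and the helper name `nfe_to_match` are assumptions made for illustration, and entries that exceed the reported range are represented as infinity.

```python
NFES = [10, 20, 50, 100, 1000]
INF = float("inf")  # stands in for entries beyond the reported FID range

# FID values copied from the table above (Prob.Flow and gDDIM rows only).
FID = {
    "DDPM": {"Prob.Flow, RK45": [INF, 52.5, 6.62, 2.63, 2.56],
             "gDDIM": [4.17, 3.03, 2.59, 2.56, 2.56]},
    "BDM": {"Prob.Flow, RK45": [INF, 68.2, 7.12, 2.58, 2.46],
            "gDDIM": [4.52, 2.97, 2.49, 2.47, 2.46]},
    "CLD": {"Prob.Flow, RK45": [INF, INF, 31.7, 4.56, 2.25],
            "gDDIM": [13.41, 3.39, 2.26, 2.26, 2.25]},
}

def nfe_to_match(model, tol=0.05):
    """Smallest listed NFE where gDDIM is within tol of Prob.Flow at 1000 NFEs."""
    target = FID[model]["Prob.Flow, RK45"][-1]
    for nfe, fid in zip(NFES, FID[model]["gDDIM"]):
        if fid <= target + tol:
            return nfe
    return None

speedup = {model: NFES[-1] / nfe_to_match(model) for model in FID}
print(speedup)  # {'DDPM': 20.0, 'BDM': 20.0, 'CLD': 20.0}
```

Because the NFE grid is coarse (nothing is listed between 20 and 50), the value 20 is a lower bound on what the table can resolve, and it depends on the chosen tolerance.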

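With a small modification to the score network parameterization, gDDIM accelerates sampling for diffusion models beyond isotropic ones.
Deterministic gDDIM achieves large speedups and competitive FID on CLD: on CIFAR-10, FID 2.26 at 50 NFEs and 2.86 at 27 NFEs.
Experiments on BDM and CLD show more than 20 times acceleration at comparable model sizes.
Choosing K_t = R_t, derived from the diffusion process, yields smoother and more stable epsilon_theta trajectories than alternatives such as L_t.
Stochastic gDDIM, with suitable approximations, further improves sampling efficiency over EM-based methods.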
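The remark about K_t rests on the fact that a covariance matrix has many factorizations. The sketch below is a toy illustration under stated assumptions, not the paper's CLD setup: it uses an arbitrary 2x2 covariance `Sigma`, takes its Cholesky factor as one candidate K, and builds a second candidate by rotating it. Whether the Cholesky factor corresponds to the L_t mentioned above is an assumption, and the helper names are hypothetical. Both factors satisfy K K^T = Sigma and reproduce the same Gaussian score through s = -K^{-T} eps, yet the eps target the network would have to fit differs between them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary symmetric positive-definite covariance (toy stand-in for Sigma_t).
A = rng.normal(size=(2, 2))
Sigma = A @ A.T + 2.0 * np.eye(2)

# Candidate 1: lower-triangular Cholesky factor.
L = np.linalg.cholesky(Sigma)

# Candidate 2: the same factor followed by a rotation; still a valid factor.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
K = L @ Q

def eps_from_score(factor, score):
    """eps target implied by s = -factor^{-T} eps."""
    return -factor.T @ score

def score_from_eps(factor, eps):
    """Recover the score from eps: s = -factor^{-T} eps."""
    return -np.linalg.solve(factor.T, eps)

# Score of a Gaussian N(mean, Sigma) at a sampled point u.
mean = np.array([0.3, -1.0])
u = mean + L @ rng.normal(size=2)
score = -np.linalg.solve(Sigma, u - mean)

eps_L = eps_from_score(L, score)
eps_K = eps_from_score(K, score)

print(np.allclose(K @ K.T, Sigma))                    # both factor Sigma
print(np.allclose(score_from_eps(L, eps_L), score),
      np.allclose(score_from_eps(K, eps_K), score))   # same score either way
print(np.allclose(eps_L, eps_K))                      # but different eps targets
```

Since every valid factor yields the same score, the choice only affects how eps_theta behaves along a sampling trajectory, which is the property the finding above compares.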

This review was generated by AI and reviewed by a human editor.