QUICK REVIEW

[Paper Review] PTQD: Accurate Post-Training Quantization for Diffusion Models

Yefei He, Luping Liu|arXiv (Cornell University)|May 18, 2023

Machine Learning in Materials Science | Cited by 13

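One-Sentence Summary

PTQD is a unified post-training quantization framework for diffusion models. It decomposes quantization noise into a correlated part and an uncorrelated part, corrects each, calibrates the variance schedule, and uses step-aware mixed precision to maintain the signal-to-noise ratio, reaching near full-precision quality with a large reduction in bit operations.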

ABSTRACT

Diffusion models have recently dominated image synthesis tasks. However, the iterative denoising process is expensive in computations at inference time, making diffusion models less practical for low-latency and scalable real-world applications. Post-training quantization (PTQ) of diffusion models can significantly reduce the model size and accelerate the sampling process without re-training. Nonetheless, applying existing PTQ methods directly to low-bit diffusion models can significantly impair the quality of generated samples. Specifically, for each denoising step, quantization noise leads to deviations in the estimated mean and mismatches with the predetermined variance schedule. As the sampling process proceeds, the quantization noise may accumulate, resulting in a low signal-to-noise ratio (SNR) during the later denoising steps. To address these challenges, we propose a unified formulation for the quantization noise and diffusion perturbed noise in the quantized denoising process. Specifically, we first disentangle the quantization noise into its correlated and residual uncorrelated parts regarding its full-precision counterpart. The correlated part can be easily corrected by estimating the correlation coefficient. For the uncorrelated part, we subtract the bias from the quantized results to correct the mean deviation and calibrate the denoising variance schedule to absorb the excess variance resulting from quantization. Moreover, we introduce a mixed-precision scheme for selecting the optimal bitwidth for each denoising step. Extensive experiments demonstrate that our method outperforms previous post-training quantized diffusion models, with only a 0.06 increase in FID score compared to full-precision LDM-4 on ImageNet 256x256, while saving 19.9x bit operations. Code is available at https://github.com/ziplab/PTQD.
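The decomposition and corrections described in the abstract can be written out as a short worked derivation. This is a sketch in standard DDPM notation, not a transcription of the paper's equations: the symbols \hat{\epsilon}_\theta, \mu_q, \sigma_q^2 and \sigma_t'^2 are assumed names, and the last step assumes the residual noise is approximately Gaussian and independent of the sampler noise.

```latex
% Quantized prediction = full-precision prediction + quantization noise
\hat{\epsilon}_\theta(x_t, t) = \epsilon_\theta(x_t, t) + \Delta\epsilon_\theta(x_t, t)

% Split the quantization noise into a correlated part and an uncorrelated residual
\Delta\epsilon_\theta(x_t, t) = k\,\epsilon_\theta(x_t, t) + \Delta\epsilon'_\theta(x_t, t)

% Dividing by (1 + k) removes the correlated part
\frac{\hat{\epsilon}_\theta(x_t, t)}{1 + k}
  = \epsilon_\theta(x_t, t) + \frac{\Delta\epsilon'_\theta(x_t, t)}{1 + k}

% Residual with mean \mu_q and variance \sigma_q^2: subtract the bias \mu_q / (1 + k),
% then absorb the remaining variance into the DDPM sampling variance \sigma_t^2
\sigma_t'^2 = \max\!\left( \sigma_t^2
  - \frac{\beta_t^2}{\alpha_t \,(1 - \bar{\alpha}_t)} \cdot \frac{\sigma_q^2}{(1 + k)^2},\; 0 \right)
```

The factor \beta_t^2 / (\alpha_t (1 - \bar{\alpha}_t)) is the square of the coefficient that multiplies the noise prediction in the DDPM update, so the subtracted term is the variance that the residual noise injects into x_{t-1}.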

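Motivation and Objectives

Motivate post-training quantization (PTQ) of diffusion models as a way to cut memory and compute without retraining.
Propose a unified noise model that separates quantization noise from the perturbation noise of diffusion denoising.
Provide correction mechanisms for both the correlated and the uncorrelated quantization noise during sampling.
Introduce a step-aware mixed-precision strategy that keeps the signal-to-noise ratio high across denoising steps.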

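Proposed Method

Quantize the model with uniform quantization and represent the quantization noise explicitly.
Decompose the quantization noise into a correlated part k*epsilon_theta and an uncorrelated residual Delta epsilon_theta', as in Eq. (7).
Correct the correlated noise by dividing the quantized output by 1+k (Eq. (9)).
Correct the uncorrelated noise with bias correction (BC) and variance schedule calibration (VSC) (Eq. (10)-(12)).
Estimate the correlation coefficient k and the statistics of the uncorrelated noise by comparing quantized and full-precision (FP) runs (Algorithm 1).
Step-aware mixed precision selects an activation bitwidth for each step from a candidate set B such that SNR^Q(t) > SNR^F(t) (Eq. (13)-(15)).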
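The estimation and correction steps listed above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, k is fit by least squares through the origin (the paper's Algorithm 1 may estimate it differently), `coeff_t` stands for the DDPM coefficient beta_t / (sqrt(alpha_t) * sqrt(1 - alpha_bar_t)), and the data is synthetic rather than taken from a real model.

```python
import numpy as np


def estimate_noise_stats(eps_fp, eps_q):
    """Split (eps_q - eps_fp) into k * eps_fp plus an uncorrelated residual.

    k is fit by least squares through the origin (an assumed estimator).
    Returns k and the residual's mean and variance.
    """
    delta = eps_q - eps_fp
    k = float(np.sum(delta * eps_fp) / np.sum(eps_fp ** 2))
    residual = delta - k * eps_fp
    return k, float(residual.mean()), float(residual.var())


def correct_output(eps_q, k, resid_mean):
    """Bias correction and correlated noise correction in one step.

    Equivalent to eps_q / (1 + k) minus the bias resid_mean / (1 + k).
    """
    return (eps_q - resid_mean) / (1.0 + k)


def calibrated_variance(sigma_t_sq, coeff_t, k, resid_var):
    """Reduce the sampler variance by what the residual noise injects.

    Clamped at zero when the injected variance exceeds sigma_t_sq.
    """
    injected = (coeff_t ** 2) * resid_var / (1.0 + k) ** 2
    return max(sigma_t_sq - injected, 0.0)


# Synthetic stand-ins for full-precision and quantized outputs (not model data):
# correlated gain 0.2, bias 0.05, residual standard deviation 0.1.
rng = np.random.default_rng(0)
eps_fp = rng.standard_normal(10_000)
eps_q = 1.2 * eps_fp + 0.05 + 0.1 * rng.standard_normal(10_000)

k, resid_mean, resid_var = estimate_noise_stats(eps_fp, eps_q)
eps_corrected = correct_output(eps_q, k, resid_mean)
print(f"k={k:.3f} residual mean={resid_mean:.3f} residual var={resid_var:.4f}")
```

In a real pipeline the statistics would presumably be gathered per timestep on calibration samples, since the quantization noise varies across denoising steps.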

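Experimental Results

Research Questions

RQ1: How does quantization noise affect the mean and variance of each diffusion denoising step?
RQ2: Does a unified decomposition of quantization noise into correlated and uncorrelated parts improve PTQ for diffusion models?
RQ3: Can the bias and variance be corrected without retraining to recover sampling quality?
RQ4: Does step-aware mixed precision preserve the signal-to-noise ratio across denoising steps while maximizing the speedup?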

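Key Findings

Separating quantization noise into correlated and uncorrelated parts makes targeted correction possible.
Correlated noise correction (CNC) lowers FID by 0.48 and sFID by 6.55 in the ablation study.
Bias correction (BC) and variance schedule calibration (VSC) lower FID by a further 0.2 and sFID by a further 0.11 in the ablation study.
With W4A4/W4A8 mixed precision, PTQD reaches an FID of 6.44 and an sFID of 8.43, only 1.33 sFID worse than FP, while saving 19.9x bit operations.
On ImageNet 256x256, PTQD raises FID by only about 0.06 relative to the 250-step full-precision LDM-4, with a smaller model and a large reduction in BOPs.
Step-aware mixed precision keeps the signal-to-noise ratio higher at each step, which makes low-bit diffusion practical without a large loss in quality.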
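The step-aware bitwidth selection behind the last finding can be sketched as follows. This is an illustration under assumptions rather than the paper's procedure: the function names are hypothetical, SNR^Q is taken as the ratio of output energy to quantization-noise energy, the rule picks the smallest candidate bitwidth that satisfies SNR^Q(t) > SNR^F(t), and the fallback to the largest bitwidth when none qualifies is a choice made here. The SNR values in the example are made up.

```python
import numpy as np


def quantization_snr(eps_fp, eps_q):
    """Energy of the full-precision output over energy of the quantization noise."""
    noise_energy = float(np.sum((eps_q - eps_fp) ** 2))
    if noise_energy == 0.0:
        return float("inf")
    return float(np.sum(eps_fp ** 2)) / noise_energy


def select_bitwidths(snr_q, snr_f, candidates):
    """Pick, for each step, the smallest bitwidth with SNR^Q(t) > SNR^F(t).

    snr_q: dict mapping bitwidth -> sequence of SNR^Q(t), one value per step
    snr_f: sequence of SNR^F(t), one value per step
    Falls back to the largest candidate when no bitwidth qualifies.
    """
    ordered = sorted(candidates)
    chosen = []
    for t, target in enumerate(snr_f):
        pick = ordered[-1]
        for bits in ordered:
            if snr_q[bits][t] > target:
                pick = bits
                break
        chosen.append(pick)
    return chosen


# Made-up values: SNR^F grows as the sample gets cleaner, so later
# denoising steps need more activation bits.
snr_f = [0.1, 5.0, 50.0, 500.0]
snr_q = {4: [20.0] * 4, 8: [200.0] * 4}
plan = select_bitwidths(snr_q, snr_f, candidates=[4, 8])
print(plan)  # [4, 4, 8, 8]
```

The last step shows the fallback: neither bitwidth clears the threshold there, so the sketch keeps the highest available precision.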


This review was generated by AI and reviewed by human editors.