QUICK REVIEW

[Paper Review] PTQD: Accurate Post-Training Quantization for Diffusion Models

Yefei He, Luping Liu|arXiv (Cornell University)|May 18, 2023

Machine Learning in Materials Science | Cited by 13

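One-Line Summary

PTQD introduces a unified post-training quantization framework for diffusion models: it disentangles quantization noise into correlated and uncorrelated parts and corrects each, calibrates the variance schedule, and uses step-aware mixed precision to preserve SNR, reaching near full-precision quality with large savings in bit operations.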

ABSTRACT

Diffusion models have recently dominated image synthesis tasks. However, the iterative denoising process is expensive in computations at inference time, making diffusion models less practical for low-latency and scalable real-world applications. Post-training quantization (PTQ) of diffusion models can significantly reduce the model size and accelerate the sampling process without re-training. Nonetheless, applying existing PTQ methods directly to low-bit diffusion models can significantly impair the quality of generated samples. Specifically, for each denoising step, quantization noise leads to deviations in the estimated mean and mismatches with the predetermined variance schedule. As the sampling process proceeds, the quantization noise may accumulate, resulting in a low signal-to-noise ratio (SNR) during the later denoising steps. To address these challenges, we propose a unified formulation for the quantization noise and diffusion perturbed noise in the quantized denoising process. Specifically, we first disentangle the quantization noise into its correlated and residual uncorrelated parts regarding its full-precision counterpart. The correlated part can be easily corrected by estimating the correlation coefficient. For the uncorrelated part, we subtract the bias from the quantized results to correct the mean deviation and calibrate the denoising variance schedule to absorb the excess variance resulting from quantization. Moreover, we introduce a mixed-precision scheme for selecting the optimal bitwidth for each denoising step. Extensive experiments demonstrate that our method outperforms previous post-training quantized diffusion models, with only a 0.06 increase in FID score compared to full-precision LDM-4 on ImageNet 256x256, while saving 19.9x bit operations. Code is available at https://github.com/ziplab/PTQD.

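Motivation and Objectives

Motivate the need for post-training quantization (PTQ) of diffusion models, which reduces memory and compute without retraining.
Develop a unified noise model that treats quantization noise alongside the diffusion perturbed noise.
Provide correction mechanisms for both the correlated and the uncorrelated quantization noise during sampling.
Introduce a step-aware mixed-precision strategy that maintains a high SNR across all denoising steps.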

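Proposed Method

Quantize the model with uniform quantization, written with an explicit quantization-noise term.
Disentangle the quantization noise into a correlated part k*epsilon_theta and an uncorrelated residual (Delta epsilon_theta'), as in Eq. (7).
Correct the correlated noise by dividing the quantized output by 1+k (Eq. (9)).
Correct the uncorrelated noise with bias correction (BC) and variance schedule calibration (VSC) (Eq. (10)-(12)).
Estimate the correlation coefficient k and the statistics of the uncorrelated noise by comparing quantized and full-precision (FP) runs (Algorithm 1).
Step-aware Mixed Precision selects the activation bitwidth for each step from a set B so that SNR^Q(t) > SNR^F(t) holds (Eq. (13)-(15)).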
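
The decomposition and correction steps listed above can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes k is fitted by least squares on paired full-precision and quantized outputs, uses synthetic data in place of a real noise-prediction network, and the helper names (`estimate_noise_stats`, `correct_output`, `calibrated_variance`) and the `coeff` parameter (the factor that multiplies the predicted noise in the sampler update) are hypothetical.

```python
import numpy as np

def estimate_noise_stats(eps_fp, eps_q):
    """Fit (eps_q - eps_fp) = k * eps_fp + residual by least squares.

    Returns k plus the mean and variance of the uncorrelated residual.
    """
    delta = eps_q - eps_fp
    k = float(np.sum(delta * eps_fp) / np.sum(eps_fp * eps_fp))
    residual = delta - k * eps_fp
    return k, float(residual.mean()), float(residual.var())

def correct_output(eps_q, k, resid_mean):
    """Subtract the residual bias, then undo the (1 + k) scaling."""
    return (eps_q - resid_mean) / (1.0 + k)

def calibrated_variance(sigma_sq, coeff, k, resid_var):
    """Shrink the sampler variance by the excess variance left after correction.

    coeff is an assumed sampler-dependent factor; the result is clipped at 0.
    """
    excess = (coeff ** 2) * resid_var / (1.0 + k) ** 2
    return max(sigma_sq - excess, 0.0)

# Synthetic stand-in for paired FP / quantized network outputs (assumption).
rng = np.random.default_rng(0)
eps_fp = rng.standard_normal(100_000)
true_k, true_bias, true_std = 0.1, 0.02, 0.05
eps_q = (1 + true_k) * eps_fp + true_bias + true_std * rng.standard_normal(eps_fp.shape)

k, mu, var = estimate_noise_stats(eps_fp, eps_q)
eps_corr = correct_output(eps_q, k, mu)

mse_before = float(np.mean((eps_q - eps_fp) ** 2))
mse_after = float(np.mean((eps_corr - eps_fp) ** 2))
print(f"k={k:.3f} bias={mu:.3f} var={var:.4f}")
print(f"MSE before={mse_before:.4f} after={mse_after:.4f}")
```

After correction, only the uncorrelated residual scaled by 1/(1+k) remains, and that is the part the calibrated variance is meant to absorb.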

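Experimental Results

Research Questions

RQ1: How does quantization noise affect the mean and variance of the diffusion denoising process?
RQ2: Does a unified decomposition of quantization noise into correlated and uncorrelated parts improve PTQ of diffusion models?
RQ3: Can sampling quality be recovered by correcting bias and variance without retraining?
RQ4: Can step-aware mixed precision maximize speedup while maintaining SNR across steps and preserving the noise decomposition?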

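Key Findings

Disentangling quantization noise into correlated and uncorrelated parts enables targeted correction.
In the ablations, Correlated Noise Correction (CNC) reduces FID by 0.48 and sFID by 6.55.
In the ablations, Bias Correction (BC) and Variance Schedule Calibration (VSC) reduce FID by a further 0.2 and sFID by 0.11.
With W4A4/W4A8 mixed precision, PTQD reaches an FID of 6.44 and an sFID of 8.43, only 1.33 sFID away from FP, while saving 19.9x bit operations.
On ImageNet 256x256, PTQD increases FID by only about 0.06 relative to the 250-step full-precision LDM-4, while substantially reducing model size and bit operations (BOPs).
Step-aware Mixed Precision maintains a higher SNR across steps, making low-bit diffusion practical without a large loss in quality.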
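
The step-aware bitwidth selection can be sketched as a simple per-step search. This is an illustration under assumptions, not the paper's code: it assumes the smallest bitwidth whose quantized SNR exceeds the full-precision SNR is preferred, falls back to the highest available bitwidth when none qualifies, and uses made-up SNR values. The function name `select_bitwidths` is hypothetical.

```python
def select_bitwidths(snr_fp, snr_q, bitwidths):
    """Pick one activation bitwidth per denoising step.

    snr_fp:    list of SNR^F(t), one value per step.
    snr_q:     dict mapping bitwidth -> list of SNR^Q(t) per step.
    bitwidths: candidate set B.
    """
    candidates = sorted(bitwidths)
    chosen = []
    for t, target in enumerate(snr_fp):
        pick = candidates[-1]  # assumed fallback: highest precision
        for b in candidates:
            if snr_q[b][t] > target:
                pick = b  # smallest bitwidth meeting SNR^Q(t) > SNR^F(t)
                break
        chosen.append(pick)
    return chosen

# Made-up SNR values for four steps (assumption, for illustration only).
snr_fp = [0.5, 5.0, 50.0, 500.0]
snr_q = {4: [10.0, 10.0, 10.0, 10.0], 8: [100.0, 100.0, 100.0, 100.0]}
schedule = select_bitwidths(snr_fp, snr_q, {4, 8})
print(schedule)  # [4, 4, 8, 8]
```

In this toy setup the early steps tolerate 4-bit activations, while the later steps, where the full-precision SNR is higher, move to 8 bits.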

This review was generated by AI and checked by a human editor.