QUICK REVIEW

[Paper Review] PTQD: Accurate Post-Training Quantization for Diffusion Models

Yefei He, Luping Liu|arXiv (Cornell University)|2023. 05. 18.

Machine Learning in Materials Science | Citations: 13

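One-Line Summary

PTQD introduces a unified post-training quantization framework for diffusion models that splits quantization noise into a part correlated with the full-precision output and an uncorrelated residual, corrects both parts, calibrates the variance schedule, and uses step-aware mixed precision to preserve SNR, reaching near full-precision quality with a substantial reduction in bit operations.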

ABSTRACT

Diffusion models have recently dominated image synthesis tasks. However, the iterative denoising process is expensive in computations at inference time, making diffusion models less practical for low-latency and scalable real-world applications. Post-training quantization (PTQ) of diffusion models can significantly reduce the model size and accelerate the sampling process without re-training. Nonetheless, applying existing PTQ methods directly to low-bit diffusion models can significantly impair the quality of generated samples. Specifically, for each denoising step, quantization noise leads to deviations in the estimated mean and mismatches with the predetermined variance schedule. As the sampling process proceeds, the quantization noise may accumulate, resulting in a low signal-to-noise ratio (SNR) during the later denoising steps. To address these challenges, we propose a unified formulation for the quantization noise and diffusion perturbed noise in the quantized denoising process. Specifically, we first disentangle the quantization noise into its correlated and residual uncorrelated parts regarding its full-precision counterpart. The correlated part can be easily corrected by estimating the correlation coefficient. For the uncorrelated part, we subtract the bias from the quantized results to correct the mean deviation and calibrate the denoising variance schedule to absorb the excess variance resulting from quantization. Moreover, we introduce a mixed-precision scheme for selecting the optimal bitwidth for each denoising step. Extensive experiments demonstrate that our method outperforms previous post-training quantized diffusion models, with only a 0.06 increase in FID score compared to full-precision LDM-4 on ImageNet 256x256, while saving 19.9x bit operations. Code is available at https://github.com/ziplab/PTQD.

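Research Motivation and Goals

Motivate post-training quantization (PTQ) as a way to cut the memory and compute cost of diffusion models without retraining.
Develop a unified noise model that covers both quantization noise and the diffusion perturbation noise in the quantized denoising process.
Provide correction mechanisms for the correlated and uncorrelated quantization noise during sampling.
Introduce a step-aware mixed-precision strategy that keeps SNR high at every sampling step.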

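Proposed Method

Quantize the model with uniform quantization and introduce notation for the resulting quantization noise.
Decompose the quantization noise into a correlated part k * epsilon_theta and an uncorrelated residual Delta'_{epsilon_theta}, as in Eq. (7).
Correlated Noise Correction (CNC): divide the quantized output by 1 + k to remove the correlated component (Eq. (9)).
Uncorrelated noise correction: Bias Correction (BC) and Variance Schedule Calibration (VSC) (Eq. (10)-(12)).
Estimate the correlation coefficient k and the statistics of the uncorrelated noise from paired quantized and full-precision (FP) runs (Algorithm 1).
Step-aware mixed precision: choose the activation bitwidth for each step from a candidate set B so that SNR^Q(t) > SNR^F(t) holds (Eq. (13)-(15)).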
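
The estimation and correction steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_noise_stats`, `correct_output`, `calibrated_variance`) are hypothetical, k is fit as a least-squares slope through the origin, the data are synthetic scalars rather than network outputs, and the variance adjustment assumes the standard DDPM posterior mean, in which an error in the noise prediction is scaled by beta_t / (sqrt(alpha_t) * sqrt(1 - alpha_bar_t)).

```python
import random


def fit_noise_stats(fp_out, q_out):
    """Split quantization noise into k * fp_out plus an uncorrelated residual."""
    noise = [q - f for q, f in zip(q_out, fp_out)]
    # Assumed estimator: least-squares slope through the origin, noise ~= k * fp_out.
    k = sum(n * f for n, f in zip(noise, fp_out)) / sum(f * f for f in fp_out)
    residual = [n - k * f for n, f in zip(noise, fp_out)]
    bias = sum(residual) / len(residual)
    var = sum((r - bias) ** 2 for r in residual) / len(residual)
    return k, bias, var


def correct_output(q_out, k, bias):
    """Subtract the residual bias, then divide by 1 + k to undo the correlated part."""
    return [(q - bias) / (1.0 + k) for q in q_out]


def calibrated_variance(sigma2_t, alpha_t, alpha_bar_t, residual_var, k):
    """Shrink the sampler variance by what the corrected residual injects into x_{t-1}."""
    beta_t = 1.0 - alpha_t
    # Assumption: standard DDPM mean, so noise-prediction error is scaled by this factor.
    coeff = beta_t ** 2 / (alpha_t * (1.0 - alpha_bar_t))
    injected = coeff * residual_var / (1.0 + k) ** 2
    # Clamp at zero when the injected variance exceeds the scheduled variance.
    return max(sigma2_t - injected, 0.0)


# Synthetic calibration data (hypothetical): true k = 0.1, residual ~ N(0.02, 0.05^2).
rng = random.Random(0)
fp = [rng.gauss(0.0, 1.0) for _ in range(20000)]
q = [f + 0.1 * f + rng.gauss(0.02, 0.05) for f in fp]

k_est, bias_est, var_est = fit_noise_stats(fp, q)
corrected = correct_output(q, k_est, bias_est)

mse_before = sum((a - b) ** 2 for a, b in zip(q, fp)) / len(fp)
mse_after = sum((a - b) ** 2 for a, b in zip(corrected, fp)) / len(fp)
print(f"k={k_est:.3f} bias={bias_est:.3f} var={var_est:.5f}")
print(f"mse before={mse_before:.5f} after={mse_after:.5f}")
```

After correction, the remaining error is only the zero-mean residual scaled by 1 / (1 + k). `calibrated_variance` covers that last part by removing the same amount of variance from the sampler's schedule, so the total per-step variance stays close to the intended one.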

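Experimental Results

Research Questions

RQ1: How does quantization noise affect the mean and variance at each diffusion denoising step?
RQ2: Can a unified decomposition of quantization noise into correlated and uncorrelated parts improve PTQ for diffusion models?
RQ3: Can sample quality be corrected and recovered without retraining?
RQ4: Does step-aware mixed precision maintain SNR across sampling steps while maximizing the speedup?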

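Key Results

Separating quantization noise into correlated and uncorrelated parts makes targeted correction straightforward.
In ablations, Correlated Noise Correction (CNC) reduces FID by 0.48 and sFID by 6.55.
Bias Correction (BC) and Variance Schedule Calibration (VSC) reduce FID by a further 0.2 and sFID by a further 0.11 in ablations.
With W4A4/W4A8 mixed precision, PTQD reaches FID 6.44 and sFID 8.43, which is 1.33 sFID worse than FP, while cutting bit operations by 19.9x.
On ImageNet 256x256, PTQD raises FID by only about 0.06 relative to the 250-step full-precision LDM-4, with a much smaller model and far fewer bit operations (BOPs).
Step-aware mixed precision keeps SNR higher across steps, which makes low-bit diffusion practical with little loss in quality.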
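
The step-aware bitwidth choice can be illustrated with a short sketch. It assumes that per-step SNR values have already been measured, that the candidate set is {4, 8} as in the W4A4/W4A8 setting, and that the smallest qualifying bitwidth is preferred to save compute. The function name `select_bitwidths` and all numbers are hypothetical.

```python
def select_bitwidths(snr_forward, snr_quant, candidates=(4, 8)):
    """Per step, pick the smallest bitwidth whose quantization SNR exceeds the forward SNR."""
    ordered = sorted(candidates)
    chosen = []
    for t, target in enumerate(snr_forward):
        pick = ordered[-1]  # fall back to the widest option if none qualifies
        for b in ordered:
            if snr_quant[b][t] > target:
                pick = b
                break
        chosen.append(pick)
    return chosen


# Hypothetical measurements, ordered from the noisiest step to the cleanest step.
snr_forward = [0.5, 2.0, 10.0, 50.0]
snr_quant = {
    4: [5.0, 5.0, 5.0, 5.0],      # SNR of the 4-bit activation setting at each step
    8: [80.0, 80.0, 80.0, 80.0],  # SNR of the 8-bit activation setting at each step
}

bits = select_bitwidths(snr_forward, snr_quant)
print(bits)  # [4, 4, 8, 8]
```

In this toy setting the noisier early steps tolerate 4-bit activations, while the later, cleaner steps need 8 bits to keep the quantization SNR above the forward-process SNR.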

This review was generated by AI and reviewed by a human editor.