QUICK REVIEW

[Paper Review] Towards Efficient Training for Neural Network Quantization

Qing Jin, Linjie Yang|arXiv (Cornell University)|Dec 21, 2019

Advanced Neural Network Applications|References: 56|Cited by: 33

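One-Sentence Summary

The paper proposes scale-adjusted training (SAT) and gradient-calibrated PACT (CG-PACT) to train quantized neural networks efficiently and accurately, achieving state-of-the-art results on MobileNet and PreResNet-50.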

ABSTRACT

Quantization reduces computation costs of neural networks but suffers from performance degeneration. Is this accuracy drop due to the reduced capacity, or inefficient training during the quantization procedure? After looking into the gradient propagation process of neural networks by viewing the weights and intermediate activations as random variables, we discover two critical rules for efficient training. Recent quantization approaches violate the two rules, which results in degenerated convergence. To deal with this problem, we propose a simple yet effective technique, named scale-adjusted training (SAT), to comply with the discovered rules and facilitate efficient training. We also analyze the quantization error introduced in calculating the gradient in the popular parameterized clipping activation (PACT) technique. Through SAT together with gradient-calibrated PACT, quantized models obtain comparable or even better performance than their full-precision counterparts, achieving state-of-the-art accuracy with consistent improvement over previous quantization methods on a wide spectrum of models including MobileNet-V1/V2 and PreResNet-50.

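Motivation and Objectives

Determine whether the accuracy loss under quantization comes from reduced capacity or from problems in the training process during quantization.
Derive rules for efficient training by studying gradient propagation in quantized networks.
Propose SAT to preserve training dynamics under quantization.
Study gradient-calibrated PACT (CG-PACT) to correct gradient errors in activation quantization.
Demonstrate state-of-the-art performance for quantized MobileNet-V1/V2 and PreResNet-50 on ImageNet.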

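Proposed Method

Analyze how gradients propagate through the effective weights of convolutional and linear layers to derive two efficient training rules (ETR I and ETR II).
Propose scale-adjusted training (SAT) to keep the weight variance in an appropriate range after clamping and quantization.
Evaluate DoReFa weight quantization under weight clamping and SAT to check that training dynamics are preserved.
Introduce CG-PACT to correct the gradient computation of the PACT activation quantizer.
Combine SAT with CG-PACT and benchmark MobileNet-V1/V2 and PreResNet-50 on ImageNet against previous quantization methods.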
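
To make the weight-side steps above more concrete, here is a minimal NumPy sketch of DoReFa-style weight quantization followed by a variance rescaling in the spirit of SAT. It is an illustration under stated assumptions, not the authors' implementation: the function names, the layer sizes, the 4-bit setting, and the choice of the output neuron count as the factor in the rescaling are all assumptions made for this example.

```python
import numpy as np


def quantize_k(x, bits):
    """Uniformly quantize values in [0, 1] onto 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels


def dorefa_weights(w, bits):
    """DoReFa-style weight quantization.

    Squash with tanh, normalize to [-1, 1], map to [0, 1],
    quantize, then map back to [-1, 1].
    """
    t = np.tanh(w)
    t = t / np.max(np.abs(t))
    return 2.0 * quantize_k((t + 1.0) / 2.0, bits) - 1.0


def scale_adjust(q, n_neurons):
    """Rescale q so that n_neurons * Var[q] equals 1.

    Assumption: this is one plausible form of the SAT rescaling;
    which neuron count to use is a choice made for this sketch.
    """
    return q / np.sqrt(n_neurons * np.var(q))


# Hypothetical fully connected layer: 512 inputs, 1000 outputs.
rng = np.random.default_rng(0)
fan_in, fan_out = 512, 1000
w = rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(fan_out, fan_in))

q = dorefa_weights(w, bits=4)
q_sat = scale_adjust(q, fan_out)

# Before rescaling the product is far from 1; after rescaling it is 1.
print("n * Var before:", fan_out * np.var(q))
print("n * Var after: ", fan_out * np.var(q_sat))
```

The rescaling is a single scalar division, so it changes neither the number of distinct weight values nor their ordering; it only moves the weight variance back toward the range that the efficient training rules ask for.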

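Experimental Results

Research Questions

RQ1: Is the accuracy drop of quantized networks mainly due to reduced capacity or to inefficient training during quantization?
RQ2: Which rules must hold for a quantized network to converge and train efficiently?
RQ3: Can SAT restore efficient training when weights are clamped or quantized?
RQ4: Does correcting PACT's activation gradient (CG-PACT) improve low-precision training?
RQ5: Do SAT and CG-PACT achieve state-of-the-art results on popular architectures on ImageNet?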

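Key Findings

SAT helps maintain efficient training by keeping the weight variance in a suitable range after clamping and quantization.
On MobileNet-V1/V2 and PreResNet-50, quantized models trained with SAT reach performance comparable to, or even better than, their full-precision counterparts.
CG-PACT improves low-precision training by calibrating the gradient with respect to PACT's clipping parameter.
Combining SAT and CG-PACT yields state-of-the-art accuracy under quantization across a range of models.
Quantization can act as a beneficial regularizer, and in some settings activation quantization improves generalization.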
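
The finding about calibrating the gradient of PACT's clipping parameter can be illustrated with a small NumPy sketch. It assumes the quantizer has the form y = alpha * q(clip(x / alpha, 0, 1)) and applies the straight-through estimator to the rounding step only, which gives dy/dalpha = q(x/alpha) - x/alpha inside the clipping range instead of 0. The function names and sample values are hypothetical, and this is one reading of the calibration rather than the paper's reference code.

```python
import numpy as np


def pact_forward(x, alpha, bits):
    """PACT-style activation quantizer: clip to [0, alpha], then quantize."""
    levels = 2 ** bits - 1
    x_norm = np.clip(x / alpha, 0.0, 1.0)
    return alpha * np.round(x_norm * levels) / levels


def grad_alpha_uncalibrated(x, alpha):
    """Gradient w.r.t. alpha when the whole quantizer is treated as a clip:
    1 where the input saturates at alpha, 0 elsewhere."""
    return (x >= alpha).astype(float)


def grad_alpha_calibrated(x, alpha, bits):
    """Gradient w.r.t. alpha with the quantization error kept (assumed form):
    1 where x >= alpha, q(x/alpha) - x/alpha inside the range, 0 for x <= 0."""
    levels = 2 ** bits - 1
    x_norm = np.clip(x / alpha, 0.0, 1.0)
    q = np.round(x_norm * levels) / levels
    return np.where(x >= alpha, 1.0, q - x_norm)


# Hypothetical activations and clipping level, 2-bit quantization.
x = np.array([-0.5, 0.3, 1.2, 2.5])
alpha, bits = 2.0, 2

y = pact_forward(x, alpha, bits)
g_plain = grad_alpha_uncalibrated(x, alpha)
g_cal = grad_alpha_calibrated(x, alpha, bits)

print("output:       ", y)
print("uncalibrated: ", g_plain)
print("calibrated:   ", g_cal)
```

Inside the clipping range the calibrated term is just the normalized rounding error, so its magnitude is bounded by half a quantization step. The correction is therefore largest at low bit widths and vanishes as precision grows.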

This review was generated by AI and checked by human editors.