QUICK REVIEW

[Paper Review] Training Quantized Nets: A Deeper Understanding

Hao Li, Soham De|arXiv (Cornell University)|Jun 7, 2017

Adversarial Robustness in Machine Learning|References: 10|Cited by: 94

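One-Sentence Summary

This paper analyzes the training of quantized neural networks from a theoretical perspective, comparing stochastic rounding (SR) with BinaryConnect (BC), proving convergence guarantees, explaining why fully quantized methods struggle with greedy optimization, and reporting experiments on CIFAR-10/100 and ImageNet.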

ABSTRACT

Currently, deep neural networks are deployed on low-power portable devices by first training a full-precision model using powerful hardware, and then deriving a corresponding low-precision model for efficient inference on such systems. However, training models directly with coarsely quantized weights is a key step towards learning on embedded platforms that have limited computing resources, memory capacity, and power consumption. Numerous recent publications have studied methods for training quantized networks, but these studies have mostly been empirical. In this work, we investigate training methods for quantized neural networks from a theoretical viewpoint. We first explore accuracy guarantees for training methods under convexity assumptions. We then look at the behavior of these algorithms for non-convex problems, and show that training algorithms that exploit high-precision representations have an important greedy search phase that purely quantized training methods lack, which explains the difficulty of training using low-precision arithmetic.

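Research Motivation and Goals

Motivate and analyze training quantized neural networks directly from scratch for embedded, low-precision hardware.
Establish theoretical convergence results for stochastic rounding (SR) and BinaryConnect (BC) in both convex and non-convex settings.
Explain why BC, which keeps a floating-point representation of the weights, helps optimization, while the fully quantized method (SR) stalls.
Compare the behavior of SR and BC on non-convex problems to understand their exploration-exploitation dynamics.
Provide empirical validation on standard architectures and datasets to illustrate the theory.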

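Proposed Method

Formulate quantized training as applying a quantization operator Q (deterministic or stochastic) in the update step.
Prove convergence results for SR, showing that in the convex setting the accuracy floor is proportional to the quantization level Δ.
Prove convergence results for BC, showing improved behavior when the objective is strongly convex or quadratic, and quantify the error floor.
Analyze non-convex behavior to contrast SR's lack of greedy exploitation with BC's annealing-like improvement.
Run experiments on CIFAR-10, CIFAR-100, and ImageNet, training VGG/ResNet variants with binary weights using SR-ADAM, BC-ADAM, R-ADAM, and Big SR-ADAM.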
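
The quantization operator and the two update styles described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the uniform grid with spacing `delta`, and the toy gradient are all hypothetical, and the rounding rules are the usual textbook forms rather than anything spelled out in this review.

```python
import numpy as np

def round_deterministic(w, delta):
    """Round each entry of w to the nearest multiple of delta."""
    return np.sign(w) * delta * np.floor(np.abs(w) / delta + 0.5)

def round_stochastic(w, delta, rng):
    """Round each entry of w down or up to a multiple of delta at random.

    The probability of rounding up equals the fractional position of w
    inside its grid cell, so the result equals w in expectation.
    """
    scaled = w / delta
    lower = np.floor(scaled)
    round_up = rng.random(np.shape(w)) < (scaled - lower)
    return delta * (lower + round_up)

def sr_step(w, grad_fn, lr, delta, rng):
    """Fully quantized step: only the rounded weights are ever stored."""
    return round_stochastic(w - lr * grad_fn(w), delta, rng)

def bc_step(w_real, grad_fn, lr, delta):
    """BinaryConnect-style step: the gradient is evaluated at the quantized
    weights but accumulated into a full-precision copy."""
    w_quant = round_deterministic(w_real, delta)
    return w_real - lr * grad_fn(w_quant)

# Hypothetical usage on the toy loss 0.5 * (w - 0.3)**2.
rng = np.random.default_rng(0)  # seed chosen arbitrarily
toy_grad = lambda w: w - 0.3
w_sr = sr_step(np.array([1.0]), toy_grad, lr=0.1, delta=0.5, rng=rng)
w_bc = bc_step(np.array([1.0]), toy_grad, lr=0.1, delta=0.5)
```

The design difference is where small updates go. In `sr_step` an update much smaller than `delta` either vanishes or becomes a full grid jump, whereas `bc_step` lets small updates accumulate in `w_real` until they cross a rounding boundary.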

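Experimental Results

Research Questions

RQ1: Can SR and BC converge when training quantized networks from scratch?
RQ2: What are the accuracy floors and convergence rates of SR and BC under convex and non-convex objectives?
RQ3: Why does BC often outperform fully quantized methods such as SR in neural network training?
RQ4: How do the exploration-exploitation dynamics of SR and BC differ during non-convex optimization?
RQ5: Do empirical results on standard architectures agree with the predictions of the quantized-training theory?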

Key Findings

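Each row is one training method; the method labels are missing from the table as extracted. The dataset names are group headers: judging by the value ranges, the first four columns are CIFAR-10, the fifth is CIFAR-100, and the sixth is ImageNet. Model names survive only for the last three columns. Values appear to be test error rates (%).

Dataset	CIFAR-10	CIFAR-10	CIFAR-10	CIFAR-10	CIFAR-100	ImageNet
Model	(missing)	(missing)	(missing)	WRN-56-2	ResNet-56	ResNet-18
(missing)	7.97	7.12	8.10	6.62	33.98	36.04
(missing)	10.36	8.21	8.83	7.17	35.34	52.11
(missing)	16.95	16.77	19.84	16.04	50.79	77.68
(missing)	23.33	20.56	26.49	21.58	58.06	88.86
(missing)	23.99	21.88	33.56	27.90	68.39	91.07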

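In the convex setting, SR and BC both converge to within O(Δ) of the minimum, with an accuracy floor that depends on the quantization level.
On quadratic (or near-quadratic) problems, BC can converge to the true minimum, whereas SR stalls at an accuracy floor that does not improve as the step size is refined.
On non-convex problems, SR lacks the greedy exploitation phase that helps BC: as the learning rate shrinks, BC's iterates concentrate near a minimum, while SR stalls.
Experiments show that BC-ADAM matches full-precision ADAM in some cases, while SR-ADAM and R-ADAM underperform, consistent with the theory.
Big-batch SR improves performance by making exploration more efficient, mitigating SR's inability to exploit local minima at small learning rates.
Across networks and datasets, SR tends to explore more than BC, making more weight changes during training, consistent with the theoretical predictions about exploration-exploitation dynamics.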
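
The contrast between SR stalling and BC improving as the learning rate shrinks can be probed with a toy simulation. The sketch below is a hedged illustration, not a reproduction of the paper's experiments: it assumes a 1-D quadratic loss 0.5·w² with Gaussian gradient noise, and the constants `DELTA` and `SIGMA`, the chain count, and all function names are hypothetical choices made for this example.

```python
import numpy as np

DELTA = 0.1  # grid spacing (assumed value)
SIGMA = 1.0  # standard deviation of the gradient noise (assumed value)

def loss(w):
    return 0.5 * w ** 2

def noisy_grad(w, rng):
    """Exact gradient of the toy loss plus Gaussian noise."""
    return w + SIGMA * rng.standard_normal(w.shape)

def round_deterministic(w):
    """Nearest grid point (np.round sends exact ties to the even multiple)."""
    return DELTA * np.round(w / DELTA)

def round_stochastic(w, rng):
    """Round down or up at random so that the result is unbiased."""
    scaled = w / DELTA
    lower = np.floor(scaled)
    return DELTA * (lower + (rng.random(w.shape) < scaled - lower))

def run_sr(lr, steps=2000, chains=1000, seed=0):
    """Mean final loss over independent chains that store only quantized weights."""
    rng = np.random.default_rng(seed)
    w = np.full(chains, 1.0)
    for _ in range(steps):
        w = round_stochastic(w - lr * noisy_grad(w, rng), rng)
    return loss(w).mean()

def run_bc(lr, steps=2000, chains=1000, seed=0):
    """Mean final loss of the quantized weights when a real-valued copy is kept."""
    rng = np.random.default_rng(seed)
    w_real = np.full(chains, 1.0)
    for _ in range(steps):
        w_real = w_real - lr * noisy_grad(round_deterministic(w_real), rng)
    return loss(round_deterministic(w_real)).mean()

if __name__ == "__main__":
    for lr in (0.1, 0.01):
        print(f"lr={lr}: SR loss={run_sr(lr):.4f}  BC loss={run_bc(lr):.4f}")
```

In this toy setup, shrinking `lr` should lower the BC loss noticeably while leaving the SR loss at roughly the same level, which mirrors the qualitative claim above. The exact numbers depend on the assumed constants and carry no weight beyond this example.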

This review was generated by AI and reviewed by a human editor.