[Paper Walkthrough] Scalable Methods for 8-bit Training of Neural Networks
With Range Batch-Normalization and Gradients Bifurcation, training neural networks almost entirely in 8-bit is feasible, reaching ImageNet-scale results with no loss of accuracy.
Quantized Neural Networks (QNNs) are often used to improve network efficiency during the inference phase, i.e. after the network has been trained. Extensive research in the field suggests many different quantization schemes. Still, the number of bits required, as well as the best quantization scheme, are yet unknown. Our theoretical analysis suggests that most of the training process is robust to substantial precision reduction, and points to only a few specific operations that require higher precision. Armed with this knowledge, we quantize the model parameters, activations and layer gradients to 8-bit, leaving at a higher precision only the final step in the computation of the weight gradients. Additionally, as QNNs require batch-normalization to be trained at high precision, we introduce Range Batch-Normalization (BN) which has significantly higher tolerance to quantization noise and improved computational complexity. Our simulations show that Range BN is equivalent to the traditional batch norm if a precise scale adjustment, which can be approximated analytically, is applied. To the best of the authors' knowledge, this work is the first to quantize the weights, activations, as well as a substantial volume of the gradients stream, in all layers (including batch normalization) to 8-bit while showing state-of-the-art results over the ImageNet-1K dataset.
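The Range BN idea in the abstract can be illustrated with a minimal sketch that normalizes one feature by its batch range rather than its variance. The function names and the specific scale constant below are assumptions for illustration: the constant uses the textbook Gaussian approximation that the range of n samples is roughly 2·σ·sqrt(2·ln n), which may differ from the paper's exact scale adjustment by a constant factor (a factor that a learnable BN scale would absorb).

```python
import math
import random

def range_bn(column, eps=1e-5):
    """Normalize one feature over the batch using its range instead of its variance."""
    n = len(column)
    mean = sum(column) / n
    centered = [v - mean for v in column]
    value_range = max(centered) - min(centered)
    # Assumed scale adjustment: for n Gaussian samples the range is roughly
    # 2 * sigma * sqrt(2 * ln n). The factor depends only on the batch size,
    # so it can be precomputed; no per-batch square root of a variance is needed.
    adjust = 2.0 * math.sqrt(2.0 * math.log(n))
    sigma_estimate = value_range / adjust
    return [c / (sigma_estimate + eps) for c in centered]

def standard_bn(column, eps=1e-5):
    """Reference batch normalization (no learnable scale or shift)."""
    n = len(column)
    mean = sum(column) / n
    var = sum((v - mean) ** 2 for v in column) / n
    return [(v - mean) / math.sqrt(var + eps) for v in column]

rng = random.Random(0)
batch = [rng.gauss(3.0, 2.0) for _ in range(256)]
out_range = range_bn(batch)
out_std = standard_bn(batch)
```

Both outputs are the same centered values divided by different per-feature constants, so on any batch they differ only by a single scale factor.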
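Motivation and Goals
- Motivate quantized training as a way to reduce compute and memory requirements during training.
- Propose 8-bit quantization of weights, activations, and most gradients while preserving accuracy.
- Address the numerical-stability bottlenecks in batch normalization and backpropagation.
- Introduce Range Batch-Normalization as a low-precision alternative to standard BN.
- Demonstrate quantized backpropagation in practical training of large models (ImageNet).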
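Proposed Method
- Replace batch normalization with Range BN, which tolerates quantization and avoids a high-precision square-root operation.
- Quantize weights, activations, and most of the gradient stream to 8 bits, while keeping the weight-gradient computation at 16 bits.
- Introduce Gradients Bifurcation: during backpropagation, compute the layer gradients g_l in 8 bits and, in parallel, compute the weight gradients g_W used for the update in 16 bits.
- Use stochastic rounding when quantizing gradients, so that the rounding error is zero-mean and does not accumulate as a bias across updates.
- Apply the GEMMLOWP quantization scheme for training-time quantization.
- Use the Straight-Through Estimator (STE) to pass gradients through the discrete quantization step during backpropagation.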
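The quantization and stochastic-rounding steps listed above can be sketched as follows. This is a minimal illustration, assuming a min/max affine 8-bit quantizer in the spirit of GEMMLOWP; the function names and the list-based interface are hypothetical and not taken from the paper or from the gemmlowp library.

```python
import math
import random

def stochastic_round(x, rng):
    """Round x down or up at random so that the expected result equals x."""
    low = math.floor(x)
    frac = x - low
    return low + (1 if rng.random() < frac else 0)

def quantize_8bit(values, rng=None):
    """Affine min/max quantization of floats to integer codes in [0, 255].

    Returns (codes, scale, lo) such that value ~= lo + code * scale.
    With rng given, stochastic rounding is used; otherwise Python's
    round-to-nearest (ties to even).
    """
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    codes = []
    for v in values:
        t = (v - lo) / scale
        q = stochastic_round(t, rng) if rng is not None else round(t)
        codes.append(min(255, max(0, q)))  # clamp against float overshoot
    return codes, scale, lo

def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]

# Usage: quantize a few gradient values with a seeded generator.
grads = [-1.0, -0.25, 0.0, 0.4, 2.0]
codes, scale, lo = quantize_8bit(grads, random.Random(1))
approx = dequantize(codes, scale, lo)
```

Each individual value is off by at most one quantization step, but because the rounding direction is random with probability equal to the fractional part, the error averages out to zero over many updates, which is the property the stochastic-rounding bullet relies on.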
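Experimental Results
Research Questions
- RQ1: On large-scale datasets, can weights, activations, and most gradients be quantized to 8 bits during training without reducing accuracy?
- RQ2: How can batch normalization and gradient computation be adapted to low precision while maintaining stability and performance?
- RQ3: Does Range BN deliver accuracy comparable to standard BN under low-precision training?
- RQ4: How does quantized backpropagation affect training speed, memory, and energy efficiency?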
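Key Findings
- Range BN approximates standard batch normalization and achieves comparable accuracy in experiments on ImageNet (e.g. ResNet-50) and CIFAR-10.
- Most training computation can run in 8-bit; only the final weight-gradient computation, which uses a 16-bit copy of the layer gradients, is kept at higher precision.
- Gradients Bifurcation (8-bit g_l and 16-bit g_W) enables efficient backpropagation without harming convergence.
- With 8-bit weights and activations and a 16-bit gradient copy, quantized backpropagation trained the tested models on ImageNet with no loss of accuracy.
- Stochastic rounding is essential for convergence when gradients are quantized.
- Compared with full-precision training, Range BN combined with 8-bit training offers significant hardware-efficiency advantages (faster MACs, lower energy consumption).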
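The Gradients Bifurcation finding can be made concrete with a toy backward pass for a single linear layer y = W x. This is only a sketch under stated assumptions: `fake_quantize` and `linear_backward_bifurcated` are hypothetical names, the quantizer is a simple symmetric one with stochastic rounding, and weights and activations are left in floating point for brevity, whereas the paper also quantizes them to 8 bits.

```python
import math
import random

def fake_quantize(values, bits, rng):
    """Quantize to a signed `bits`-wide grid with stochastic rounding, then dequantize."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    levels = 2 ** (bits - 1) - 1  # 127 for 8-bit, 32767 for 16-bit
    scale = max_abs / levels
    out = []
    for v in values:
        t = v / scale
        low = math.floor(t)
        q = low + (1 if rng.random() < t - low else 0)
        out.append(max(-levels, min(levels, q)) * scale)
    return out

def linear_backward_bifurcated(weight, x, grad_out, rng):
    """Backward pass of y = W x using two differently quantized copies of grad_out.

    weight: list of rows (out_dim x in_dim); x: list (in_dim); grad_out: list (out_dim).
    """
    g8 = fake_quantize(grad_out, 8, rng)    # 8-bit copy: propagated to the previous layer
    g16 = fake_quantize(grad_out, 16, rng)  # 16-bit copy: used for the weight gradient
    out_dim, in_dim = len(grad_out), len(x)
    grad_in = [sum(weight[o][i] * g8[o] for o in range(out_dim)) for i in range(in_dim)]
    grad_w = [[g16[o] * x[i] for i in range(in_dim)] for o in range(out_dim)]
    return grad_in, grad_w

# Usage on a 2x3 layer with a seeded generator.
W = [[0.5, -1.0, 0.25], [1.5, 0.75, -0.5]]
x = [1.0, -2.0, 0.5]
g = [0.3, -0.7]
grad_in, grad_w = linear_backward_bifurcated(W, x, g, random.Random(0))
```

The design point is that the gradient passed down the network tolerates coarse 8-bit noise, while the weight gradient, which feeds the parameter update directly, is computed from the finer 16-bit copy.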
This walkthrough was generated by AI and reviewed by a human editor.