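[Paper Explained] Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights
INQ converts a pre-trained full-precision CNN into a low-precision model whose weights are either powers of two or zero. Using weight partition, group-wise quantization, and re-training, it reaches lossless or better accuracy at 5 bits on mainstream architectures, and stays competitive at 4, 3, and 2 bits.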
This paper presents incremental network quantization (INQ), a novel method that aims to efficiently convert any pre-trained full-precision convolutional neural network (CNN) model into a low-precision version whose weights are constrained to be either powers of two or zero. Unlike existing methods, which suffer from noticeable accuracy loss, INQ has the potential to resolve this issue thanks to two innovations. First, we introduce three interdependent operations: weight partition, group-wise quantization, and re-training. A well-proven measure is used to divide the weights in each layer of a pre-trained CNN model into two disjoint groups. The weights in the first group form a low-precision base, so they are quantized by a variable-length encoding method. The weights in the other group compensate for the accuracy loss caused by quantization, so they are the ones to be re-trained. Second, these three operations are repeated on the latest re-trained group in an iterative manner until all the weights are converted into low-precision ones, acting as an incremental network quantization and accuracy enhancement procedure. Extensive experiments on the ImageNet classification task using almost all known deep CNN architectures, including AlexNet, VGG-16, GoogleNet, and ResNets, testify to the efficacy of the proposed method. Specifically, at 5-bit quantization, our models achieve higher accuracy than the 32-bit floating-point references. Taking ResNet-18 as an example, we further show that our quantized models with 4-bit, 3-bit, and 2-bit ternary weights have improved or very similar accuracy compared with the 32-bit floating-point baseline. In addition, impressive results from combining network pruning with INQ are also reported. The code is available at https://github.com/Zhouaojun/Incremental-Network-Quantization.
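Motivation and Objectives
- Push toward lossless quantization of CNNs at low bit-widths, enabling efficient deployment on hardware such as FPGAs.
- Introduce three interdependent operations (weight partition, group-wise quantization, and re-training) to minimize accuracy loss.
- Develop an incremental training strategy that quantizes weights group by group while re-training the remaining weights to recover accuracy.
- Demonstrate applicability across multiple architectures on ImageNet, and explore combining the method with pruning for deeper compression.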
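Proposed Method
- Constrain the weights of each layer to a set P_l made up of ±2^n, for n in a bounded exponent range [n2, n1], plus 0. With a b-bit variable-length encoding, one bit represents zero and the remaining b−1 bits index at most 2^(b−1) power-of-two values, so P_l holds at most 2^(b−1)+1 levels.
- In each layer, split the weights into two disjoint groups using a pruning-inspired measure: one group forms the low-precision base, and the other is the re-trainable group that compensates for the quantization loss.
- Iteratively quantize one group while re-training the other. A binary mask T_l is updated to freeze the quantized weights, so only the not-yet-quantized weights receive updates.
- Quantization uses a sequence of powers of two and a rounding rule defined on adjacent levels, mapping each weight to its nearest quantization level (Eq. 4).
- The target bit-width b sets the number of quantization levels. n1 is computed from the largest weight magnitude in the layer and is then used to derive n2. The share of weights quantized by the end of each step is given by the accumulated quantized portion σ_n.
- Re-training follows SGD with a masked update: W_l ← W_l − γ ∂E/∂W_l · T_l, where γ is the learning rate and T_l zeroes the update for already-quantized weights.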
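The steps above can be sketched in a few lines of plain Python. This is a minimal illustration for a single flattened layer, not the authors' implementation. It assumes the formulas n1 = floor(log2(4s/3)) with s = max|W_l| and n2 = n1 + 1 − 2^(b−1)/2, and it reads Eq. 4 as "a weight takes the level β when (α+β)/2 ≤ |w| < 3β/2 for adjacent levels α < β, and zero otherwise"; none of these are spelled out in the summary, so check them against the paper. It also assumes the pruning-inspired measure ranks weights by magnitude. The function names, the toy weights, and the constant gradients are hypothetical.

```python
import math


def exponent_bounds(weights, bits):
    """Return (n1, n2), the largest and smallest exponents of the level set.

    Assumed formulas: n1 = floor(log2(4*s/3)) with s = max|w|,
    and n2 = n1 + 1 - 2**(bits-1)/2. Requires bits >= 2.
    """
    s = max(abs(w) for w in weights)
    n1 = math.floor(math.log2(4.0 * s / 3.0))
    n2 = n1 + 1 - (2 ** (bits - 1)) // 2
    return n1, n2


def quantize(w, n1, n2):
    """Map one weight to a signed power of two in [2**n2, 2**n1], or to zero."""
    a = abs(w)
    sign = 1.0 if w >= 0 else -1.0
    # Sorted non-negative levels: 0, 2**n2, ..., 2**n1.
    levels = [0.0] + [2.0 ** n for n in range(n2, n1 + 1)]
    for alpha, beta in zip(levels, levels[1:]):
        if (alpha + beta) / 2 <= a < 3 * beta / 2:
            return sign * beta
    return 0.0  # too small to reach the first non-zero level


def inq_step(weights, mask, portion, n1, n2):
    """Quantize in place until `portion` of all weights are frozen.

    Assumption: the largest-magnitude free weights are quantized first.
    mask[i] == 0 marks a quantized (frozen) weight, 1 a re-trainable one.
    """
    target = int(round(portion * len(weights)))
    frozen = mask.count(0)
    free = sorted((i for i, t in enumerate(mask) if t == 1),
                  key=lambda i: abs(weights[i]), reverse=True)
    for i in free[:max(0, target - frozen)]:
        weights[i] = quantize(weights[i], n1, n2)
        mask[i] = 0


def masked_sgd_step(weights, grads, mask, lr):
    """W <- W - lr * grad * T: frozen weights (T = 0) stay unchanged."""
    return [w - lr * g * t for w, g, t in zip(weights, grads, mask)]


# Toy run: 3-bit levels, accumulated portions 50% then 100%.
weights = [0.9, -0.6, 0.3, -0.12, 0.05, 0.01]
mask = [1] * len(weights)
n1, n2 = exponent_bounds(weights, bits=3)  # levels become 0, ±0.5, ±1
for portion in (0.5, 1.0):
    inq_step(weights, mask, portion, n1, n2)
    grads = [1.0] * len(weights)  # placeholder for real back-propagated gradients
    weights = masked_sgd_step(weights, grads, mask, lr=0.1)
print(weights)
```

In this toy run the three largest weights are frozen at ±0.5 or 1 after the first step, the masked update moves only the remaining three, and the second step quantizes those to zero. A real run would replace the single placeholder update with several epochs of re-training between quantization steps.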
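Experimental Results
Research Questions
- RQ1: Can a sequential, group-wise quantization schedule maintain or improve accuracy relative to the full-precision baseline on large CNNs?
- RQ2: Under low-bit quantization, how does the weight partition strategy (pruning-inspired vs. random) affect final accuracy?
- RQ3: On ImageNet architectures, which bit-widths (4, 3, or 2 bits) are feasible for lossless or near-lossless quantization?
- RQ4: How does INQ work together with network pruning to maximize compression while preserving accuracy?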
Key Findings
| Network | Bit-width | Top-1 error | Top-5 error | Decrease in Top-1 / Top-5 error |
|---|---|---|---|---|
| AlexNet | 32 (ref) | 42.76% | 19.77% | - |
| AlexNet | 5 | 42.61% | 19.54% | 0.15% / 0.23% |
| VGG-16 | 32 (ref) | 31.46% | 11.35% | - |
| VGG-16 | 5 | 29.18% | 9.70% | 2.28% / 1.65% |
| GoogleNet | 32 (ref) | 31.11% | 10.97% | - |
| GoogleNet | 5 | 30.98% | 10.72% | 0.13% / 0.25% |
| ResNet-18 | 32 (ref) | 31.73% | 11.31% | - |
| ResNet-18 | 5 | 31.02% | 10.90% | 0.71% / 0.41% |
| ResNet-50 | 32 (ref) | 26.78% | 8.76% | - |
| ResNet-50 | 5 | 25.19% | 7.55% | 1.59% / 1.21% |
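- At 5-bit quantization, INQ achieves lossless or higher Top-1/Top-5 accuracy on AlexNet, VGG-16, GoogleNet, ResNet-18, and ResNet-50 (Top-1 gains of 0.13 to 2.28 percentage points and Top-5 gains of 0.23 to 1.65 percentage points).
- The method converges easily in practice: for 5-bit models, each quantization step typically needs fewer than 8 re-training epochs.
- With 4-bit, 3-bit, and 2-bit (ternary) weights, ResNet-18 shows improved or nearly identical accuracy relative to the 32-bit baseline, so the 5-bit result largely carries over to the 4/3/2-bit settings.
- For 5-bit INQ on ResNet-18, pruning-inspired weight partition outperforms random partition (Top-1/Top-5 improvement of 1.09%/0.83%).
- Combining INQ with dynamic network surgery (DNS), a pruning method, yields substantial additional compression (for example, up to 53× on AlexNet with 5-bit quantization), with minimal change or a slight improvement in accuracy compared with the Deep Compression baseline.
- Compared with vector quantization alone, INQ achieves lower-bit representations while preserving accuracy on the networks tested.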
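As a quick sanity check, the last column of the table and the ranges quoted in the first finding can be re-derived from the error rates themselves. The short script below is only an arithmetic check on the numbers listed above; the variable and function names are illustrative.

```python
# (network, ref Top-1, ref Top-5, 5-bit Top-1, 5-bit Top-5), errors in percent
rows = [
    ("AlexNet", 42.76, 19.77, 42.61, 19.54),
    ("VGG-16", 31.46, 11.35, 29.18, 9.70),
    ("GoogleNet", 31.11, 10.97, 30.98, 10.72),
    ("ResNet-18", 31.73, 11.31, 31.02, 10.90),
    ("ResNet-50", 26.78, 8.76, 25.19, 7.55),
]


def error_decrease(ref, quantized):
    """Decrease in error rate, in percentage points, rounded to 2 decimals."""
    return round(ref - quantized, 2)


decreases = {
    name: (error_decrease(r1, q1), error_decrease(r5, q5))
    for name, r1, r5, q1, q5 in rows
}
top1 = [d[0] for d in decreases.values()]
top5 = [d[1] for d in decreases.values()]

for name, (d1, d5) in decreases.items():
    print(f"{name}: Top-1 -{d1:.2f}, Top-5 -{d5:.2f}")
print(f"Top-1 range: {min(top1):.2f} to {max(top1):.2f}")
print(f"Top-5 range: {min(top5):.2f} to {max(top5):.2f}")
```

Every 5-bit model has a lower error than its 32-bit reference in both metrics, which is what "lossless or better" means in this table.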
This explainer was generated by AI and reviewed by a human editor.