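[Paper Digest] Up or Down? Adaptive Rounding for Post-Training Quantization
AdaRound is a data-adaptive weight-rounding method for post-training quantization. Without any fine-tuning, it uses a layer-wise QUBO formulation, a continuous relaxation, and a small amount of unlabeled data to improve substantially on rounding-to-nearest. It achieves new state-of-the-art results on several networks and tasks.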
When quantizing neural networks, assigning each floating-point weight to its nearest fixed-point value is the predominant approach. We find that, perhaps surprisingly, this is not the best we can do. In this paper, we propose AdaRound, a better weight-rounding mechanism for post-training quantization that adapts to the data and the task loss. AdaRound is fast, does not require fine-tuning of the network, and only uses a small amount of unlabelled data. We start by theoretically analyzing the rounding problem for a pre-trained neural network. By approximating the task loss with a Taylor series expansion, the rounding task is posed as a quadratic unconstrained binary optimization problem. We simplify this to a layer-wise local loss and propose to optimize this loss with a soft relaxation. AdaRound not only outperforms rounding-to-nearest by a significant margin but also establishes a new state-of-the-art for post-training quantization on several networks and tasks. Without fine-tuning, we can quantize the weights of Resnet18 and Resnet50 to 4 bits while staying within an accuracy loss of 1%.
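The claim that rounding-to-nearest is not the best we can do can be illustrated with a toy calculation. The following is a minimal sketch, not the paper's implementation: it assumes a hypothetical 2×2 Hessian with a positive off-diagonal term and a quantization grid with step 1.0, and it brute-forces every up/down choice to minimize the quadratic term Δwᵀ H Δw of the Taylor expansion. All function names and numeric values are illustrative assumptions.

```python
import itertools
import math


def quadratic_cost(delta, hessian):
    """Return delta^T H delta, the second-order proxy for the loss increase."""
    n = len(delta)
    return sum(delta[i] * hessian[i][j] * delta[j]
               for i in range(n) for j in range(n))


def round_to_nearest(weights, step):
    """Round each weight independently to the closest grid point."""
    return [math.floor(w / step + 0.5) * step for w in weights]


def rounding_candidates(weights, step):
    """For each weight, return its (round-down, round-up) grid values."""
    return [(math.floor(w / step) * step, math.ceil(w / step) * step)
            for w in weights]


def best_rounding(weights, hessian, step):
    """Brute-force the binary up/down choice that minimizes the quadratic cost.

    Exponential in the number of weights, so only usable on toy problems.
    """
    best_choice, best_cost = None, float("inf")
    for choice in itertools.product(*rounding_candidates(weights, step)):
        delta = [q - w for q, w in zip(choice, weights)]
        cost = quadratic_cost(delta, hessian)
        if cost < best_cost:
            best_choice, best_cost = list(choice), cost
    return best_choice, best_cost


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only.
    weights = [0.4, 0.4]
    hessian = [[1.0, 0.5], [0.5, 1.0]]
    step = 1.0

    nearest = round_to_nearest(weights, step)
    nearest_cost = quadratic_cost(
        [q - w for q, w in zip(nearest, weights)], hessian)
    choice, cost = best_rounding(weights, hessian, step)
    print("nearest:", nearest, "cost:", nearest_cost)
    print("best:   ", choice, "cost:", cost)
```

Under these assumed values, rounding both weights down costs about 0.48, while rounding one up and one down costs about 0.28, because the off-diagonal Hessian term rewards perturbations of opposite sign. The brute-force search grows exponentially with the number of weights, which is the motivation for relaxing the problem rather than enumerating it.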
Research Motivation and Objectives
- Analyze why rounding-to-nearest can be suboptimal for post-training quantization of neural networks.
- Develop a theoretically grounded, efficient per-layer rounding method (AdaRound) that adapts to data and task loss without fine-tuning.
- Demonstrate AdaRound’s effectiveness across multiple networks and tasks, achieving high accuracy with 4-bit weights.
- Show that AdaRound outperforms existing post-training quantization methods and requires only small amounts of unlabeled data.
Proposed Method
- Formulate weight rounding as a per-layer Quadratic Unconstrained Binary Optimization (QUBO) problem derived from a second-order Taylor expansion of the task loss.
- Introduce a diagonal approximation of the Hessian to enable layer-wise optimization and reduce complexity.
- Relax the NP-hard QUBO into a continuous problem using soft quantization variables and a differentiable regularizer that encourages the variables to binarize.
- Use an asymmetric reconstruction loss and an activation-aware formulation to better capture post-quantization effects.
- Optimize layer-by-layer with data-efficient optimization (small unlabeled dataset) and an AdaRound objective that can be solved via Hopfield-style continuous relaxation.
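The soft relaxation and binarization regularizer listed above can be sketched as follows. This is a hedged illustration rather than the authors' code: the rectified-sigmoid form, the stretch constants 1.1 and -0.1, and the regularizer 1 - |2h(V) - 1|^β follow the AdaRound paper as commonly described but are not stated in this digest, so treat them as assumptions. The function names, the integer range arguments `q_min`/`q_max`, and the example values are hypothetical.

```python
import math

# Stretch parameters of the rectified sigmoid (assumed values, see lead-in).
ZETA = 1.1
GAMMA = -0.1


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def rectified_sigmoid(v):
    """Soft rounding variable h(V) in [0, 1].

    The sigmoid is stretched to (GAMMA, ZETA) and clipped, so h can reach
    exactly 0 (round down) or 1 (round up) for finite v.
    """
    return min(1.0, max(0.0, sigmoid(v) * (ZETA - GAMMA) + GAMMA))


def soft_quantize(w, v, step, q_min, q_max):
    """Soft-quantized weight: step * clip(floor(w / step) + h(v), q_min, q_max)."""
    level = math.floor(w / step) + rectified_sigmoid(v)
    return step * min(float(q_max), max(float(q_min), level))


def binarization_penalty(vs, beta):
    """Regularizer sum(1 - |2 h(v) - 1| ** beta).

    It is zero when every h(v) is exactly 0 or 1 and largest at h(v) = 0.5.
    A large beta keeps the penalty flat near 0.5; lowering beta during
    optimization pushes the variables toward a hard up/down decision.
    """
    return sum(1.0 - abs(2.0 * rectified_sigmoid(v) - 1.0) ** beta for v in vs)


if __name__ == "__main__":
    # Hypothetical weight and grid, for illustration only.
    w, step = 0.4, 1.0
    for v in (-10.0, 0.0, 10.0):
        print(v, soft_quantize(w, v, step, q_min=-8, q_max=7))
    print(binarization_penalty([-10.0, 0.0, 10.0], beta=2.0))
```

In a full layer-wise procedure, the variables `v` would be trained by gradient descent to minimize the layer's output reconstruction error plus a weighted `binarization_penalty`, and the final rounding would be read off from whether each `h(v)` ends at 0 or 1. That training loop is omitted here because it depends on an autodiff framework and hyperparameters not given in this digest.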
Experimental Results
Research Questions
- RQ1: Can weight rounding during post-training quantization be formulated as a per-layer optimization problem that accounts for data and task loss interactions?
- RQ2: Does AdaRound, using a continuous relaxation and layer-wise optimization, outperform traditional rounding-to-nearest across multiple architectures and bit-widths?
- RQ3: What is the impact of different design choices (Hessian approximation, local MSE loss, asymmetric reconstruction, activation awareness) on post-training quantization performance?
- RQ4: How much unlabeled data is needed for AdaRound to achieve competitive accuracy, and does data domain affect performance?
Key Findings
| Rounding | First layer (Acc %) | All layers (Acc %) |
|---|---|---|
| Nearest | 52.29 | 23.99 |
| H^w task loss (cf. (13)) | 68.62 ± 0.17 | N/A |
| Local MSE loss (cf. (20)) | 69.39 ± 0.04 | 65.83 ± 0.14 |
| Cont. relaxation (cf. (21)) | 69.58 ± 0.03 | 66.56 ± 0.12 |
- AdaRound substantially improves over rounding-to-nearest for post-training quantization across several networks and tasks.
- Using a diagonal Hessian approximation and a local MSE objective yields competitive performance and enables feasible layer-wise optimization.
- Continuous relaxation with Hopfield-inspired optimization and an explicit regularizer yields strong performance and often outperforms STE-based methods.
- Asymmetric reconstruction and activation-aware loss formulations provide additional gains over the base AdaRound objective.
- AdaRound can quantize networks (e.g., ResNet-18/50, MobileNetV2, InceptionV3, DeepLabV3) to 4-bit weights with little accuracy loss (within ~1% in some cases). It needs only a small set of unlabeled data (as few as 256 images) to approach FP32 performance.
This digest was generated by AI and reviewed by human editors.