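[Paper Summary] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Pruning
By examining fairness in comparison setups and the overlooked role of trainability, the paper analyzes why pruning benchmarks are so muddled. It shows that the finetuning learning rate drives much of the reported gains, and that under some fair benchmarks pruning may not be necessary.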
The state of neural network pruning has been noticed to be unclear and even confusing for a while, largely due to "a lack of standardized benchmarks and metrics" [3]. To standardize benchmarks, first, we need to answer: what kind of comparison setup is considered fair? This basic yet crucial question has barely been clarified in the community, unfortunately. Meanwhile, we observe several papers have used (severely) sub-optimal hyper-parameters in pruning experiments, while the reason behind them is also elusive. These sub-optimal hyper-parameters further exacerbate the distorted benchmarks, rendering the state of neural network pruning even more obscure. Two mysteries in pruning represent such a confusing status: the performance-boosting effect of a larger finetuning learning rate, and the no-value argument of inheriting pretrained weights in filter pruning. In this work, we attempt to explain the confusing state of network pruning by demystifying the two mysteries. Specifically, (1) we first clarify the fairness principle in pruning experiments and summarize the widely-used comparison setups; (2) then we unveil the two pruning mysteries and point out the central role of network trainability, which has not been well recognized so far; (3) finally, we conclude the paper and give some concrete suggestions regarding how to calibrate the pruning benchmarks in the future. Code: https://github.com/mingsun-tse/why-the-state-of-pruning-so-confusing.
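Motivation and Objectives
- Clarify what counts as a fair comparison in neural network pruning experiments.
- Survey and formalize the main pruning comparison setups used in the literature.
- Demystify two pruning "mysteries" (M1: the effect of the finetuning learning rate; M2: the value of pruning, i.e., of inheriting pretrained weights in filter pruning) and relate them to network trainability.
- Highlight how trainability explains the performance gaps observed under different benchmarks.
- Offer concrete suggestions for calibrating and standardizing pruning benchmarks in the future.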
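Proposed Method
- Review pruning experiment setups and classify them into a fairness-driven framework.
- Systematically analyze how different finetuning learning rate schedules affect pruning performance.
- Compare pruning against training from scratch empirically under strictly controlled setups (including S4.2, SX-A, and SX-B).
- Use ResNet34/ResNet50 on ImageNet/ImageNet100 to reproduce and quantify the effect of hyper-parameters, especially the finetuning LR.
- Present tabulated results (e.g., L1-norm pruning vs. training from scratch) to show how the choice of benchmark changes the conclusions.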
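The L1-norm pruning baseline mentioned above can be illustrated with a small sketch. It is not the paper's code: the function names (`l1_norm`, `select_filters_to_keep`), the nested-list weight layout, and the toy weights are all assumptions made for illustration. The idea it shows is the usual one for L1-norm filter pruning: score each convolution filter by the sum of the absolute values of its weights, then drop the lowest-scoring fraction.

```python
from typing import List

# One conv filter stored as [in_channels][kernel_h][kernel_w] (assumed layout).
Filter = List[List[List[float]]]


def l1_norm(filt: Filter) -> float:
    """Sum of absolute values of all weights in one filter."""
    return sum(abs(w) for channel in filt for row in channel for w in row)


def select_filters_to_keep(filters: List[Filter], prune_ratio: float) -> List[int]:
    """Return indices (in original order) of the filters that survive pruning."""
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    num_prune = int(len(filters) * prune_ratio)
    # Rank filter indices by ascending L1 norm; the first `num_prune` are removed.
    ranked = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]))
    pruned = set(ranked[:num_prune])
    return [i for i in range(len(filters)) if i not in pruned]


# Toy layer: 4 filters, 1 input channel, 2x2 kernels (made-up weights).
toy_layer = [
    [[[0.5, -0.5], [0.5, -0.5]]],  # L1 norm 2.0
    [[[0.1, 0.0], [0.0, -0.1]]],   # L1 norm 0.2
    [[[1.0, 1.0], [-1.0, 1.0]]],   # L1 norm 4.0
    [[[0.0, 0.2], [0.1, 0.0]]],    # L1 norm about 0.3
]

kept = select_filters_to_keep(toy_layer, prune_ratio=0.5)
print(kept)  # indices of the two filters with the largest L1 norms
```

In a real experiment the surviving filters would be copied into a narrower layer and the network finetuned; how that finetuning is configured is exactly what the summary argues matters most.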
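Experimental Results
Research Questions
- RQ1: What constitutes a fair comparison setup in neural network pruning experiments?
- RQ2: How do different finetuning learning rate schedules affect the perceived effectiveness of pruning methods?
- RQ3: Once retraining cost is accounted for fairly, is filter pruning that inherits pretrained weights actually valuable?
- RQ4: How do the various benchmark setups (S2, S3.x, S4.x, SX) affect conclusions about pruning methods and training from scratch?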
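Key Findings
- A larger finetuning learning rate schedule can significantly improve pruning performance, and under the same retraining configuration it can even match or surpass more sophisticated pruning methods (M1).
- The perceived value of pruning (M2) depends on the comparison setup; under strictly fair setups that allow a larger finetuning LR, the "no-value" claim weakens or disappears.
- Network trainability plays a central role in pruning results; accounting for it explains why simple L1-norm pruning can match modern methods when finetuning is configured properly.
- Benchmarking practice in the field is inconsistent (differences in baseline models, finetuning epochs, learning rate schedules, and so on), which causes confusion and undermines progress.
- Strict fairness principles (e.g., S4.2, SX-A, SX-B) yield more reliable comparisons by controlling finetuning and pruning cost.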
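To make the finetuning learning rate comparison concrete, here is a minimal sketch of two schedules side by side. Every number in it (initial rates, milestone epochs, decay factor) is a placeholder chosen for illustration and is not taken from the paper; `step_lr` is a hypothetical helper implementing plain step decay, where the rate is multiplied by `gamma` at each milestone epoch.

```python
from typing import Dict, List


def step_lr(initial_lr: float, milestones: List[int], gamma: float, epoch: int) -> float:
    """Learning rate at `epoch`: multiplied by `gamma` once per milestone reached."""
    num_decays = sum(1 for m in milestones if epoch >= m)
    return initial_lr * (gamma ** num_decays)


# Two hypothetical finetuning configurations; all values are placeholders.
schedules: Dict[str, dict] = {
    "small_constant_lr": {"initial_lr": 0.001, "milestones": [], "gamma": 0.1},
    "larger_decayed_lr": {"initial_lr": 0.01, "milestones": [30, 60], "gamma": 0.1},
}

for name, cfg in schedules.items():
    # Sample the schedule at the start and at each assumed milestone epoch.
    lrs = [step_lr(epoch=e, **cfg) for e in (0, 30, 60)]
    print(name, lrs)
```

Two pruning methods finetuned under different rows of such a table are not being compared on equal terms, which is why the fairness setups above fix the finetuning configuration before comparing methods.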
This summary was generated by AI and reviewed by a human editor.