QUICK REVIEW

[Paper Review] AMC: AutoML for Model Compression and Acceleration on Mobile Devices

Yihui He, Ji Lin | arXiv (Cornell University) | Feb 10, 2018

Machine Learning and Data Classification | References: 92 | Cited by: 366

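One-Sentence Summary

AMC uses reinforcement learning (DDPG) to learn a layer-wise compression policy automatically. It beats hand-crafted methods on the accuracy/latency trade-off and delivers measured speedups on both mobile and GPU hardware.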

ABSTRACT

Model compression is a critical technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets. Conventional model compression techniques rely on hand-crafted heuristics and rule-based policies that require domain experts to explore the large design space trading off among model size, speed, and accuracy, which is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Model Compression (AMC), which leverages reinforcement learning to provide the model compression policy. This learning-based compression policy outperforms conventional rule-based compression policy by having a higher compression ratio, better preserving the accuracy, and freeing human labor. Under 4x FLOPs reduction, we achieved 2.7% better accuracy than the handcrafted model compression policy for VGG-16 on ImageNet. We applied this automated, push-the-button compression pipeline to MobileNet and achieved 1.81x speedup of measured inference latency on an Android phone and 1.43x speedup on the Titan XP GPU, with only 0.1% loss of ImageNet Top-1 accuracy.

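Motivation and Objectives

Enable efficient deployment of neural networks on mobile devices under latency and resource constraints.
Automate the search for a per-layer compression policy that maximizes accuracy within a hardware budget.
Demonstrate that the method generalizes across network architectures (VGG, ResNet, MobileNet) and tasks (from classification to detection).
Provide two reward schemes, one for resource-constrained compression and one for accuracy-guaranteed compression.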

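Proposed Method

Model compression is formulated as a layer-by-layer continuous action control problem.
A DDPG agent takes a layer embedding vector of 11 features and outputs a fine-grained sparsity ratio a_t in (0,1].
Layers are compressed one at a time, and the compressed model is evaluated without fine-tuning to obtain a fast estimate of final accuracy.
The reward combines accuracy with a hardware metric (FLOPs or parameter count) under two protocols: resource-constrained and accuracy-guaranteed.
Evaluation starts from pretrained networks, with a final fine-tuning pass after the policy search to reach the best performance.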
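
The per-layer action and the two reward protocols described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names are hypothetical, the action a_t is interpreted as the fraction of channels kept (the summary only calls it a sparsity ratio), channels are ranked by a simple magnitude score, and the reward forms (-Error, and -Error · log(FLOPs)) are the ones given in the AMC paper rather than in this summary.

```python
import math
from typing import List


def channels_to_keep(num_channels: int, action: float) -> int:
    """Map a continuous action in (0, 1] to a channel count (hypothetical helper)."""
    if not 0.0 < action <= 1.0:
        raise ValueError("action must lie in (0, 1]")
    # Always keep at least one channel so the layer stays connected.
    return max(1, round(num_channels * action))


def prune_by_magnitude(channel_norms: List[float], action: float) -> List[int]:
    """Return the sorted indices of the channels kept, preferring larger norms."""
    keep = channels_to_keep(len(channel_norms), action)
    ranked = sorted(range(len(channel_norms)),
                    key=lambda i: channel_norms[i], reverse=True)
    return sorted(ranked[:keep])


def reward_resource_constrained(error: float) -> float:
    """Assumes the budget is enforced by limiting the actions; reward is -error."""
    return -error


def reward_accuracy_guaranteed(error: float, flops: float) -> float:
    """Couples error with cost: -error * log(FLOPs). Smaller models score higher."""
    return -error * math.log(flops)


kept = prune_by_magnitude([0.1, 0.9, 0.5, 0.3], action=0.5)
small = reward_accuracy_guaranteed(error=0.3, flops=1e6)
large = reward_accuracy_guaranteed(error=0.3, flops=4e6)
```

With equal error, the cheaper model receives the larger reward, which is what lets the accuracy-guaranteed protocol push the agent toward smaller networks without a fixed budget.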

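Results

Research Questions

RQ1: Can a reinforcement learning agent discover layer-wise compression policies that beat hand-crafted heuristics?
RQ2: Do continuous per-layer sparsity actions allow finer and more effective model shrinking than discrete choices?
RQ3: Does AMC generalize its compression policies across architectures and tasks (from classification to detection)?
RQ4: Do the resource-constrained and accuracy-guaranteed reward schemes reliably hit the target budget without sacrificing performance?
RQ5: What speedups and accuracy changes does AMC deliver in practice on mobile devices and GPUs?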

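Key Findings

At a 4× FLOPs reduction, AMC achieves 2.7% higher top-1 accuracy than the hand-crafted policy for VGG-16 on ImageNet.
AMC compresses MobileNet to reach a 1.81× speedup in measured inference latency on an Android phone and a 1.43× speedup on the Titan XP GPU, with only a 0.1% loss in ImageNet Top-1 accuracy.
For ResNet-50, AMC raises the expert-tuned 3.4× compression ratio to 5× with no loss of accuracy on ImageNet.
On a Google Pixel 1, AMC achieves a 1.95× speedup; across the mobile and GPU measurements it reports gains of 1.53× to 1.95× while maintaining accuracy and outperforming heuristic baselines.
AMC generalizes to object detection: with 4× pruning of a VGG-16-based Faster R-CNN, it matches or exceeds the mAP of hand-crafted pruning at the same compression level.
On CIFAR-10, AMC outperforms hand-crafted policies for Plain-20 and ResNet-56 under both FLOPs and parameter budgets.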
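
To make a figure such as "4× FLOPs reduction" concrete, the short calculation below shows how channel pruning translates into compute savings for a single standard convolution. The layer shape is purely illustrative and is not taken from the paper; the point is only that cost scales with the product of input and output channels, so keeping half of each cuts the cost by four.

```python
def conv_flops(c_in: int, c_out: int, kernel: int, out_h: int, out_w: int) -> int:
    """Multiply-accumulate count of a standard (non-grouped) convolution."""
    return c_in * c_out * kernel * kernel * out_h * out_w


# Illustrative layer: 3x3 convolution on a 28x28 output map.
dense = conv_flops(c_in=256, c_out=256, kernel=3, out_h=28, out_w=28)
# Keep half of the input channels and half of the output channels.
pruned = conv_flops(c_in=128, c_out=128, kernel=3, out_h=28, out_w=28)

reduction = dense / pruned
print(f"FLOPs reduction: {reduction:.1f}x")
```

A uniform ratio like this is the kind of hand-crafted baseline AMC is compared against; the learned policy instead assigns a different ratio to each layer while meeting the same overall budget.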

This review was generated by AI and reviewed by a human editor.