QUICK REVIEW

[Paper Review] NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications

Tien-Ju Yang, Andrew Howard|arXiv (Cornell University)|Apr 9, 2018
Advanced Neural Network Applications | References: 22 | Cited by: 72
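One-Sentence Summary

NetAdapt automatically simplifies a pre-trained DNN until it meets a mobile platform's resource budget, guided by direct metrics (such as latency) obtained through empirical measurements, while maximizing accuracy.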

ABSTRACT

This work proposes an algorithm, called NetAdapt, that automatically adapts a pre-trained deep neural network to a mobile platform given a resource budget. While many existing algorithms simplify networks based on the number of MACs or weights, optimizing those indirect metrics may not necessarily reduce the direct metrics, such as latency and energy consumption. To solve this problem, NetAdapt incorporates direct metrics into its adaptation algorithm. These direct metrics are evaluated using empirical measurements, so that detailed knowledge of the platform and toolchain is not required. NetAdapt automatically and progressively simplifies a pre-trained network until the resource budget is met while maximizing the accuracy. Experiment results show that NetAdapt achieves better accuracy versus latency trade-offs on both mobile CPU and mobile GPU, compared with the state-of-the-art automated network simplification algorithms. For image classification on the ImageNet dataset, NetAdapt achieves up to a 1.7$\times$ speedup in measured inference latency with equal or higher accuracy on MobileNets (V1&V2).

Motivation and Objectives

  • Motivate the need for platform-aware network adaptation that uses direct resource metrics instead of indirect proxies.
  • Propose NetAdapt, an automatic iterative algorithm that adapts a pretrained network to meet a latency budget while maximizing accuracy.
  • Show that NetAdapt outperforms state-of-the-art automated simplification methods on MobileNets across CPU and GPU mobile platforms.
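The objective of meeting a resource budget while maximizing accuracy can be written as a constrained optimization problem. The block below is a sketch of that formulation. The symbols are notation assumed here for illustration: $Net$ is the simplified network, $Acc(\cdot)$ its accuracy, $Res_j(\cdot)$ its measured consumption of the $j$-th resource (for example latency), $Bud_j$ the corresponding budget, and $\Delta R_{i,j}$ the amount by which the constraint on resource $j$ is tightened at iteration $i$.

```latex
% Overall problem: maximize accuracy subject to m resource budgets.
\begin{aligned}
\max_{Net} \quad & Acc(Net) \\
\text{subject to} \quad & Res_j(Net) \le Bud_j, \qquad j = 1, \dots, m
\end{aligned}

% Iterative relaxation: at iteration i the constraint is only slightly
% tighter than what the previous network Net_{i-1} already consumes.
\begin{aligned}
\max_{Net_i} \quad & Acc(Net_i) \\
\text{subject to} \quad & Res_j(Net_i) \le Res_j(Net_{i-1}) - \Delta R_{i,j}, \qquad j = 1, \dots, m
\end{aligned}
```

Solving a sequence of mildly tightened problems, rather than the full problem in one step, is what allows each step to be evaluated with on-device measurements.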

Proposed Method

  • Formulate the adaptation as a non-convex constrained optimization, maximizing accuracy under resource budgets.
  • Iteratively generate network proposals by removing filters from individual layers and evaluating them on the target platform using empirical measurements.
  • Use a resource reduction schedule to tighten constraints across iterations and select the highest-accuracy network at each step.
  • Employ short-term and long-term fine-tuning to recover accuracy after simplifications.
  • Estimate resource consumption quickly via layer-wise look-up tables built from empirical measurements, enabling fast guidance of proposals.
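The steps above can be sketched as a small, runnable toy. This is a minimal illustration under stated assumptions, not the authors' implementation: a "network" is just a tuple of per-layer filter counts, `toy_measure` stands in for an on-device latency measurement, `toy_accuracy` stands in for short-term fine-tuning followed by evaluation, and the reduction schedule is a constant `delta`. All function names and the cost model are hypothetical.

```python
import math
from typing import Callable, Dict, Optional, Tuple

Network = Tuple[int, ...]  # toy stand-in: number of filters in each layer

def make_lut(measure_layer: Callable[[int, int], float]) -> Callable[[int, int], float]:
    """Layer-wise look-up table: each (in_channels, filters) config is measured once."""
    table: Dict[Tuple[int, int], float] = {}
    def lookup(in_ch: int, filters: int) -> float:
        key = (in_ch, filters)
        if key not in table:
            table[key] = measure_layer(in_ch, filters)  # would be an on-device run
        return table[key]
    return lookup

def net_latency(net: Network, lookup, in_ch: int = 3) -> float:
    """Estimate network latency as the sum of per-layer look-up table entries."""
    total, prev = 0.0, in_ch
    for filters in net:
        total += lookup(prev, filters)
        prev = filters  # this layer's filters are the next layer's input channels
    return total

def choose_num_filters(net: Network, k: int, constraint: float,
                       lookup) -> Optional[Tuple[Network, float]]:
    """Largest filter count for layer k whose estimated latency meets the constraint."""
    for filters in range(net[k] - 1, 0, -1):
        candidate = net[:k] + (filters,) + net[k + 1:]
        latency = net_latency(candidate, lookup)
        if latency <= constraint:
            return candidate, latency
    return None  # layer k alone cannot satisfy this iteration's constraint

def netadapt(net: Network, budget: float, delta: float, lookup,
             evaluate: Callable[[Network], float]) -> Tuple[Network, float]:
    """Tighten the constraint by `delta` per iteration until the budget is met."""
    latency = net_latency(net, lookup)
    while latency > budget:
        constraint = latency - delta
        proposals = []
        for k in range(len(net)):  # one proposal per layer
            result = choose_num_filters(net, k, constraint, lookup)
            if result is not None:
                candidate, cand_latency = result
                # evaluate() stands in for short-term fine-tuning + validation
                proposals.append((evaluate(candidate), candidate, cand_latency))
        if not proposals:
            raise ValueError("no single-layer proposal meets the constraint")
        _, net, latency = max(proposals, key=lambda p: p[0])  # keep most accurate
    return net, latency  # long-term fine-tuning would follow here

# Hypothetical cost and accuracy models, used only to make the sketch runnable.
def toy_measure(in_ch: int, filters: int) -> float:
    return 0.01 * in_ch * filters + 0.05  # pretend milliseconds

def toy_accuracy(net: Network) -> float:
    return sum(math.sqrt(f) for f in net)  # more filters -> higher toy score

final_net, final_latency = netadapt((16, 32, 64), budget=20.0, delta=2.0,
                                    lookup=make_lut(toy_measure),
                                    evaluate=toy_accuracy)
print(final_net, round(final_latency, 2))
```

Because removing filters from layer k also shrinks the input of layer k+1, the estimate re-sums every layer instead of adjusting a single entry. In a real setting the look-up table would be filled from measurements on the target device, and filter selection and fine-tuning would operate on actual weights.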

Experimental Results

Research Questions

  • RQ1: Can direct metrics measured on target mobile platforms (e.g., latency) lead to better accuracy-latency trade-offs than indirect proxies (e.g., MACs)?
  • RQ2: How effectively can a pretrained network be automatically adapted to meet a latency budget without platform-specific modeling?
  • RQ3: What is the impact of short-term vs. long-term fine-tuning in NetAdapt’s adaptation process?
  • RQ4: Do layer-wise filter removals guided by empirical measurements yield scalable improvements across MobileNet variants?

Key Findings

Network                   | Top-1 Accuracy (%) | MACs (×10^6) | Latency (ms)
25% MobileNetV1 (128) [9] | 45.1               | 13.6         | 4.65
MorphNet [5]              | 46.0               | 15.0         | 6.52
NetAdapt                  | 46.3               | 11.0         | 6.01
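A short calculation makes the relationship between MACs and latency in the table above easier to see. The sketch below normalizes each row to the first row, which is treated here as the baseline (an assumption made for this illustration); the numbers are copied from the table and the function name is hypothetical.

```python
# Rows copied from the table: (top-1 accuracy in %, MACs in millions, latency in ms).
rows = {
    "25% MobileNetV1 (128)": (45.1, 13.6, 4.65),
    "MorphNet": (46.0, 15.0, 6.52),
    "NetAdapt": (46.3, 11.0, 6.01),
}

def relative_to_baseline(rows, baseline):
    """Express each row as an accuracy delta and MACs/latency ratios vs. the baseline."""
    base_acc, base_macs, base_lat = rows[baseline]
    return {
        name: {
            "acc_delta": acc - base_acc,
            "macs_ratio": macs / base_macs,
            "latency_ratio": lat / base_lat,
        }
        for name, (acc, macs, lat) in rows.items()
    }

rel = relative_to_baseline(rows, "25% MobileNetV1 (128)")
for name, r in rel.items():
    print(f"{name}: {r['acc_delta']:+.1f} pts, "
          f"MACs {r['macs_ratio']:.0%}, latency {r['latency_ratio']:.0%}")
```

In these rows the NetAdapt network uses about 81% of the baseline's MACs yet runs at about 129% of its latency, which is consistent with the point that reducing an indirect metric does not necessarily reduce the direct one.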
  • For image classification on ImageNet, NetAdapt achieves up to a 1.7x speedup in measured inference latency with equal or higher accuracy on MobileNets (V1 & V2), outperforming state-of-the-art automated network simplification algorithms.
  • Experiments show NetAdapt provides better accuracy-latency trade-offs across mobile CPU and GPU platforms compared with multipliers, MorphNet, and ADC.
  • A family of simplified networks with different accuracy–latency trade-offs is produced, enabling dynamic network selection.
  • Ablations demonstrate the importance of direct metrics, short-/long-term fine-tuning, and resource-reduction schedules in achieving performance gains.


This review was generated by AI and reviewed by human editors.