QUICK REVIEW

[Paper Review] Accelerator-aware Neural Network Design using AutoML

Suyog Gupta, Berkin Akin|arXiv (Cornell University)|Mar 5, 2020

CCD and CMOS Imaging Sensors|References: 11|Cited by: 49

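One-Sentence Summary

This paper proposes accelerator-aware neural architecture search (NAS) for designing edge-optimized vision models for the Edge TPU. The resulting models, EfficientNet-EdgeTPU and MobilenetEdgeTPU, improve the latency-accuracy tradeoff on Coral and Pixel 4 devices. The approach combines latency modeling, hardware-aware search space design, and NAS to tailor models to a specific accelerator.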

ABSTRACT

While neural network hardware accelerators provide a substantial amount of raw compute throughput, the models deployed on them must be co-designed for the underlying hardware architecture to obtain the optimal system performance. We present a class of computer vision models designed using hardware-aware neural architecture search and customized to run on the Edge TPU, Google's neural network hardware accelerator for low-power, edge devices. For the Edge TPU in Coral devices, these models enable real-time image classification performance while achieving accuracy typically seen only with larger, compute-heavy models running in data centers. On Pixel 4's Edge TPU, these models improve the accuracy-latency tradeoff over existing SoTA mobile models.

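Research Motivation and Goals

Enable privacy-preserving, responsive on-device inference on resource-constrained hardware.
Show that hardware-aware NAS, by co-designing architectures with the target accelerator, can outperform hand-designed mobile models.
Develop latency estimation methods and a hardware-aware search framework that optimizes for both accuracy and latency.
Tailor the search space to include blocks that maximize Edge TPU utilization while excluding incompatible operations.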

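Proposed Method

Extend NAS with an accelerator performance predictor that estimates latency on the target hardware.
Estimate model latency with a cycle-accurate Edge TPU simulator, alongside an analytical performance model (APM) for fast latency estimates.
Fold latency into the NAS objective so that, together with accuracy, it forms a multi-objective reward.
Design the search space around hardware-aware blocks, such as fused inverted bottleneck convolutions, to improve utilization.
Exclude operations that the production Edge TPU software stack does not support, so that the resulting models are deployable.
Scale the architectures with EfficientNet-style compound scaling to produce the variants (-S, -M, -L).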
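
As an illustration of how latency and accuracy might be combined into a single reward, here is a minimal sketch. This review does not state the exact reward formula, so the sketch assumes the soft-constraint form popularized by MnasNet, accuracy × (latency / target)^w. The function name `nas_reward`, its parameters, and the default exponent `w = -0.07` are assumptions made for illustration and are not taken from the paper.

```python
def nas_reward(accuracy, latency_ms, target_ms, w=-0.07):
    """Hypothetical multi-objective NAS reward (assumed MnasNet-style form).

    accuracy   : validation accuracy of the candidate model, in [0, 1]
    latency_ms : latency estimated for the candidate on the target accelerator
    target_ms  : latency budget the search is aiming for
    w          : negative exponent; assumed value, controls how strongly
                 latency above the target is penalized
    """
    if latency_ms <= 0 or target_ms <= 0:
        raise ValueError("latencies must be positive")
    # At the target latency the factor is 1, so the reward equals accuracy.
    # Slower models are scaled down, faster models are scaled up slightly.
    return accuracy * (latency_ms / target_ms) ** w


if __name__ == "__main__":
    # Placeholder numbers, not results from the paper.
    for latency in (2.5, 5.0, 10.0):
        print(latency, round(nas_reward(0.75, latency, target_ms=5.0), 4))
```

In a search loop, `latency_ms` would come from whichever estimator is in use (the simulator or the analytical model), and the controller would be trained to maximize this scalar.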

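Experimental Results

Research Questions

RQ1: Can accelerator-aware NAS discover models that outperform hand-tuned mobile architectures on Edge TPU hardware?
RQ2: How do the latency estimation methods (cycle-accurate simulation versus the analytical model) affect the efficiency and outcome of NAS?
RQ3: Which architectural blocks maximize Edge TPU utilization and accuracy in Coral and Pixel 4 deployments?
RQ4: How do the searched models compare with baseline mobile networks and EfficientNets in latency, accuracy, and deployability?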

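Key Findings

EfficientNet-EdgeTPU-S/M/L achieve faster runtimes and higher accuracy than ResNet50 and Inception on the Edge TPU in Coral devices.
MobilenetEdgeTPU reaches 75.6% top-1 accuracy on the Pixel 4 Edge TPU, with 30% lower latency than MobilenetV3.
The NAS-generated models improve the accuracy-latency Pareto frontier on Edge TPU targets relative to existing mobile models.
On Pixel 4, MobilenetEdgeTPU shows different latency characteristics than on Coral, which underscores the need for hardware-specific search spaces.
By tailoring models to a specific Edge TPU variant and production stack, accelerator-aware NAS reduces manual architecture engineering.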
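
To make the Pareto-frontier finding concrete, the sketch below shows one simple way to extract the accuracy-latency frontier from a set of measured models. The function `pareto_front` and the model entries are hypothetical; the numbers are placeholders chosen for illustration and are not results from the paper.

```python
def pareto_front(models):
    """Return the models not dominated in (lower latency, higher accuracy).

    models: iterable of (name, latency_ms, accuracy) tuples.
    The result is sorted by increasing latency.
    """
    front = []
    best_accuracy = float("-inf")
    # Visit models from fastest to slowest; on a latency tie, the more
    # accurate model comes first so that it is the one kept.
    for name, latency, accuracy in sorted(models, key=lambda m: (m[1], -m[2])):
        # A model stays only if it beats every faster (or equally fast) model.
        if accuracy > best_accuracy:
            front.append((name, latency, accuracy))
            best_accuracy = accuracy
    return front


if __name__ == "__main__":
    # Placeholder measurements, not taken from the paper.
    candidates = [
        ("a", 2.0, 0.70),
        ("b", 3.0, 0.68),
        ("c", 4.0, 0.75),
        ("d", 4.0, 0.74),
        ("e", 6.0, 0.75),
    ]
    for entry in pareto_front(candidates):
        print(entry)
```

A new model family improves the frontier when adding its points to the candidate set displaces some of the previously non-dominated models.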

This review was generated by AI and checked by a human editor.