QUICK REVIEW

[Paper Review] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Mingxing Tan, Quoc V. Le | arXiv (Cornell University) | May 28, 2019

Advanced Neural Network Applications | References: 50 | Cited by: 5,012

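One-Sentence Summary

This paper introduces a principled compound scaling method that uniformly scales network depth, width, and resolution and, starting from the NAS-designed baseline EfficientNet-B0, produces a family of EfficientNets that are far more accurate and efficient than previous ConvNets.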

ABSTRACT

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet. To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets. In particular, our EfficientNet-B7 achieves state-of-the-art 84.3% top-1 accuracy on ImageNet, while being 8.4x smaller and 6.1x faster on inference than the best existing ConvNet. Our EfficientNets also transfer well and achieve state-of-the-art accuracy on CIFAR-100 (91.7%), Flowers (98.8%), and 3 other transfer learning datasets, with an order of magnitude fewer parameters. Source code is at https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet.

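Motivation and Objectives

Make ConvNet scaling across multiple dimensions more efficient, and improve accuracy, under a fixed resource budget.
Propose a compound scaling method that balances depth, width, and resolution under a given resource constraint.
Develop a mobile-size baseline network (EfficientNet-B0) through neural architecture search.
Show that compound scaling yields a family of models (EfficientNets) that outperform previous ConvNets on ImageNet and on transfer tasks.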

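Proposed Method

Formulate a ConvNet as a stack of layers in which all layers are scaled uniformly.
Introduce compound scaling with a compound coefficient phi and constants alpha, beta, gamma, where the depth, width, and resolution multipliers are d = alpha^phi, w = beta^phi, r = gamma^phi, subject to alpha * beta^2 * gamma^2 ≈ 2.
Build EfficientNet-B0 with a multi-objective neural architecture search that optimizes both accuracy and FLOPS.
Scale EfficientNet-B0 with fixed alpha, beta, gamma and increasing phi to obtain EfficientNet-B1 through EfficientNet-B7.
Apply training techniques (SiLU/Swish activation, AutoAugment, stochastic depth, and a dropout schedule) to improve performance.
Compare against baselines (ResNet, GPipe, NASNet, and others) on ImageNet and on transfer datasets.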
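
The compound scaling rule listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the values alpha = 1.2, beta = 1.1, gamma = 1.15 are the ones the EfficientNet paper reports for the B0 baseline (they are not given in this review), and the function name `compound_scale` and the 224-pixel base resolution are assumptions made for the example.

```python
# Constants the EfficientNet paper reports for the B0 baseline.
# They are not stated in this review; treat them as an assumption here.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi, base_resolution=224):
    """Return the multipliers implied by d=alpha^phi, w=beta^phi, r=gamma^phi.

    `compound_scale` and `base_resolution` are hypothetical names for this sketch.
    """
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    return {
        "depth_multiplier": depth,
        "width_multiplier": width,
        "resolution_multiplier": resolution,
        "input_size": round(base_resolution * resolution),
        # FLOPS grow linearly with depth and quadratically with width and resolution.
        "flops_multiplier": depth * width ** 2 * resolution ** 2,
    }


if __name__ == "__main__":
    # alpha * beta^2 * gamma^2 should be close to 2.
    print(f"FLOPS growth per unit of phi: {ALPHA * BETA ** 2 * GAMMA ** 2:.3f}")
    for phi in range(4):
        s = compound_scale(phi)
        print(
            f"phi={phi}  depth x{s['depth_multiplier']:.2f}  "
            f"width x{s['width_multiplier']:.2f}  "
            f"input {s['input_size']}px  FLOPS x{s['flops_multiplier']:.2f}"
        )
```

With these constants, alpha * beta^2 * gamma^2 is about 1.92, so each unit increase in phi roughly doubles FLOPS. The printed sizes are illustrative only and are not claimed to match the released B1 to B7 configurations.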

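Experimental Results

Research Questions

RQ1: Does scaling multiple ConvNet dimensions (depth, width, resolution) together in a coordinated, principled way improve both accuracy and efficiency?
RQ2: Under principled scaling, which minimal baseline architecture reaches state-of-the-art performance with fewer parameters and FLOPS?
RQ3: How do the scaled EfficientNets perform when transferred to other datasets, compared with existing architectures?
RQ4: Is there a practical scaling rule that generalizes across model sizes and hardware constraints?

Key Findings

EfficientNet-B7 reaches 84.3% top-1 accuracy on ImageNet with 66M parameters and 37B FLOPS, using 8.4x fewer parameters and running 6.1x faster at inference than GPipe.
EfficientNet models cut parameters and FLOPS substantially (up to 8.4x fewer parameters and up to 16x fewer FLOPS) while matching or exceeding the accuracy of previous ConvNets.
With similar FLOPS, EfficientNet-B4 improves top-1 accuracy over ResNet-50 from 76.3% to 83.0%.
In transfer learning, EfficientNets achieve state-of-the-art accuracy on 5 of 8 datasets, with up to 21x fewer parameters than previous models.
In CPU latency measurements, EfficientNet-B1 runs 5.7x faster than ResNet-152, and EfficientNet-B7 runs 6.1x faster than GPipe.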
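
The ratios in these findings can be sanity-checked with simple arithmetic. The sketch below uses only the figures quoted above; the helper names are hypothetical, and the implied GPipe size is a back-of-the-envelope estimate from rounded numbers, not a value reported in this review.

```python
# Figures quoted in the findings above.
B7_PARAMS_M = 66        # EfficientNet-B7 parameters, in millions
PARAM_RATIO = 8.4       # B7 is described as 8.4x smaller than GPipe
RESNET50_TOP1 = 76.3    # top-1 accuracy (%) quoted for ResNet-50
B4_TOP1 = 83.0          # top-1 accuracy (%) quoted for EfficientNet-B4


def implied_baseline_size(model_size, reduction_ratio):
    """Size of the larger baseline implied by 'reduction_ratio times smaller'."""
    return model_size * reduction_ratio


def accuracy_gain(new_top1, old_top1):
    """Absolute gain in percentage points, rounded to one decimal place."""
    return round(new_top1 - old_top1, 1)


if __name__ == "__main__":
    gpipe_params = implied_baseline_size(B7_PARAMS_M, PARAM_RATIO)
    print(f"Implied GPipe size: about {gpipe_params:.0f}M parameters")
    print(f"B4 gain over ResNet-50: {accuracy_gain(B4_TOP1, RESNET50_TOP1)} points")
```

Under these figures the implied baseline is roughly 554M parameters, and the B4 gain over ResNet-50 is 6.7 percentage points.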


This review was generated by AI and reviewed by human editors.