QUICK REVIEW

[Paper Review] EfficientNetV2: Smaller Models and Faster Training

Mingxing Tan, Quoc V. Le | arXiv (Cornell University) | Apr 1, 2021

Advanced Neural Network Applications | References: 41 | Cited by: 1,120

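One-Sentence Summary

The paper introduces EfficientNetV2, a family of smaller, faster-training convolutional networks found with training-aware NAS and scaling and trained with progressive learning under adaptive regularization. The models reach higher accuracy than prior models with fewer parameters and less training time.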

ABSTRACT

This paper introduces EfficientNetV2, a new family of convolutional networks that have faster training speed and better parameter efficiency than previous models. To develop this family of models, we use a combination of training-aware neural architecture search and scaling, to jointly optimize training speed and parameter efficiency. The models were searched from the search space enriched with new ops such as Fused-MBConv. Our experiments show that EfficientNetV2 models train much faster than state-of-the-art models while being up to 6.8x smaller. Our training can be further sped up by progressively increasing the image size during training, but it often causes a drop in accuracy. To compensate for this accuracy drop, we propose to adaptively adjust regularization (e.g., dropout and data augmentation) as well, such that we can achieve both fast training and good accuracy. With progressive learning, our EfficientNetV2 significantly outperforms previous models on ImageNet and CIFAR/Cars/Flowers datasets. By pretraining on the same ImageNet21k, our EfficientNetV2 achieves 87.3% top-1 accuracy on ImageNet ILSVRC2012, outperforming the recent ViT by 2.0% accuracy while training 5x-11x faster using the same computing resources. Code will be available at https://github.com/google/automl/tree/master/efficientnetv2.

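Motivation and Objectives

Improve the training efficiency and the parameter efficiency of convolutional networks at the same time.
Identify the training bottlenecks in EfficientNet and the architectural choices that speed up training.
Develop a training-aware NAS and scaling framework that jointly optimizes accuracy, training speed, and parameter count.
Propose progressive learning with adaptive regularization, so that accuracy is preserved while the image size grows during training.
Pretrain efficiently on ImageNet21k and demonstrate strong performance on ImageNet and on transfer-learning tasks.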

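Proposed Method

Analyze the training bottlenecks in EfficientNet (V1) and identify directions for improvement.
Extend the search space to cover both MBConv and Fused-MBConv blocks, and run training-aware NAS over it.
Apply a non-uniform, per-stage scaling strategy and cap the maximum training image size.
Introduce progressive learning with adaptive regularization to speed up training without losing accuracy.
Pretrain on ImageNet21k, then fine-tune on ImageNet ILSVRC2012 and downstream datasets to assess generalization.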
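The progressive-learning step can be sketched in a few lines of Python. This is a minimal illustration, assuming that training is split into a fixed number of stages and that both the image size and each regularization strength are linearly interpolated from a starting value to a final value. The function name `progressive_schedule`, the stage count, and all numeric values below are hypothetical placeholders chosen for illustration, not settings taken from this review.

```python
def progressive_schedule(num_stages, size_start, size_end, reg_start, reg_end):
    """Return one entry per stage with an image size and regularization strengths.

    Assumption: plain linear interpolation between the start and end values.
    """
    if num_stages < 2:
        raise ValueError("num_stages must be at least 2")
    schedule = []
    for i in range(num_stages):
        t = i / (num_stages - 1)  # 0.0 at the first stage, 1.0 at the last
        image_size = int(round(size_start + (size_end - size_start) * t))
        # Regularization grows together with the image size.
        regs = {
            name: reg_start[name] + (reg_end[name] - reg_start[name]) * t
            for name in reg_start
        }
        schedule.append({"stage": i, "image_size": image_size, "regularization": regs})
    return schedule


# Hypothetical settings: small images with weak regularization first,
# large images with strong regularization last.
stages = progressive_schedule(
    num_stages=4,
    size_start=128,
    size_end=300,
    reg_start={"dropout": 0.1, "randaug_magnitude": 5.0},
    reg_end={"dropout": 0.3, "randaug_magnitude": 15.0},
)
for stage in stages:
    print(stage)
```

A training loop would read one entry per stage, resize its input pipeline to `image_size`, and set dropout and augmentation strength from `regularization` before continuing. The design point is that the two quantities move together: early stages pair small images with weak regularization, later stages pair large images with strong regularization.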

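Experimental Results

Research Questions

RQ1: Can training-aware NAS jointly optimize accuracy, training speed, and parameter efficiency for convolutional networks?
RQ2: Can MBConv and Fused-MBConv blocks, combined with non-uniform scaling, give faster training and smaller models without sacrificing accuracy?
RQ3: Can progressive learning with adaptive regularization speed up training on ImageNet and transfer tasks while maintaining or improving accuracy?
RQ4: How does EfficientNetV2 compare with previous convolutional networks and ViT in training speed, parameter efficiency, and inference latency?
RQ5: How does ImageNet21k pretraining affect downstream transfer-learning performance?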

Key Findings

Model	Top-1 Accuracy	Parameters	FLOPs	Inference Time (ms)	Training Time (hours)
EfficientNetV2-S	83.9%	22M	8.8B	24	7.1
EfficientNetV2-M	85.1%	54M	24B	57	13
EfficientNetV2-L	85.7%	120M	53B	98	24
EfficientNetV2-XL	87.3%	208M	94B	-	45
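To make the trade-off in the table easier to read, the short Python sketch below expresses the M and L rows relative to the S row. The numbers are copied from the table above; the helper name `relative_cost` and the choice of EfficientNetV2-S as the baseline are illustrative assumptions. The XL row is left out because not all of its columns are reported.

```python
# Rows copied from the table: (top-1 %, parameters in M, FLOPs in B, training hours)
models = {
    "EfficientNetV2-S": (83.9, 22, 8.8, 7.1),
    "EfficientNetV2-M": (85.1, 54, 24, 13),
    "EfficientNetV2-L": (85.7, 120, 53, 24),
}


def relative_cost(name, baseline="EfficientNetV2-S"):
    """Compare one row against a baseline row (hypothetical helper)."""
    acc, params, flops, hours = models[name]
    b_acc, b_params, b_flops, b_hours = models[baseline]
    return {
        "accuracy_gain_pts": round(acc - b_acc, 2),
        "params_ratio": round(params / b_params, 2),
        "flops_ratio": round(flops / b_flops, 2),
        "train_time_ratio": round(hours / b_hours, 2),
    }


for name in ("EfficientNetV2-M", "EfficientNetV2-L"):
    print(name, relative_cost(name))
```

Read this way, each step up in model size buys a small accuracy gain for a much larger increase in parameters, FLOPs, and training time.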

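EfficientNetV2 models train 5x-11x faster than previous models on ImageNet and have up to 6.8x fewer parameters.
Training-aware NAS over a search space containing both MBConv and Fused-MBConv yields EfficientNetV2 architectures that beat EfficientNet in training speed and parameter efficiency.
Progressive learning with adaptive regularization speeds up training substantially while maintaining or improving accuracy on ImageNet and the transfer datasets.
Trained with the same computing resources, EfficientNetV2-M reaches accuracy comparable to EfficientNet-B7 while training 11x faster.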
With ImageNet21k pretraining, EfficientNetV2 reaches 87.3% top-1 on ImageNet ILSVRC2012 (the EfficientNetV2-XL row in the table above), outperforming ViT-L/16 (21k) while training 5x-11x faster.
EfficientNetV2 shows strong transfer-learning performance on CIFAR-10, CIFAR-100, Flowers, and Cars compared with previous convolutional networks and ViT.
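The findings above credit a search space that mixes MBConv and Fused-MBConv blocks. As a rough sketch of how the two block types differ in size, the Python below counts convolution weights for one block of each kind. It assumes the commonly described layouts, which this review does not spell out: MBConv as a 1x1 expansion, a 3x3 depthwise convolution, and a 1x1 projection; Fused-MBConv as a single regular 3x3 convolution followed by a 1x1 projection. The function names, the expansion ratio of 4, and the channel widths are illustrative choices, and squeeze-and-excitation, biases, and normalization parameters are ignored.

```python
def mbconv_weights(c_in, c_out, expand=4, k=3):
    """Weights in an assumed MBConv layout: 1x1 expand, kxk depthwise, 1x1 project."""
    hidden = c_in * expand
    expand_1x1 = c_in * hidden        # pointwise expansion
    depthwise = k * k * hidden        # one kxk filter per hidden channel
    project_1x1 = hidden * c_out      # pointwise projection
    return expand_1x1 + depthwise + project_1x1


def fused_mbconv_weights(c_in, c_out, expand=4, k=3):
    """Weights in an assumed Fused-MBConv layout: regular kxk conv, 1x1 project."""
    hidden = c_in * expand
    fused_conv = k * k * c_in * hidden  # dense kxk conv replaces expand + depthwise
    project_1x1 = hidden * c_out
    return fused_conv + project_1x1


# Hypothetical channel widths for a narrow early stage and a wide late stage.
for channels in (24, 256):
    plain = mbconv_weights(channels, channels)
    fused = fused_mbconv_weights(channels, channels)
    print(channels, plain, fused, round(fused / plain, 2))
```

Under these assumptions the fused block carries several times more weights than the plain block at every width, so any training-speed advantage has to come from how efficiently dense 3x3 convolutions run on the hardware, not from a smaller block. That is one reason to let the search decide where each block type goes instead of fixing the placement by hand.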


This review was generated by AI and checked by human editors.