QUICK REVIEW

[Paper Review] CARS: Continuous Evolution for Efficient Neural Architecture Search

Zhaohui Yang, Yunhe Wang|arXiv (Cornell University)|Sep 11, 2019

Advanced Neural Network Applications | References: 63 | Cited by: 26

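One-Sentence Summary

CARS proposes a continuous evolutionary neural architecture search framework that uses a parameter-sharing SuperNet and non-dominated sorting (pNSGA-III) to efficiently evolve diverse architectures that trade off model size against accuracy. By reusing the SuperNet and the population across generations, CARS completes its search in only 0.4 GPU days and produces models with 3.7M to 5.1M parameters that, according to the paper, surpass existing state-of-the-art methods on ImageNet in both accuracy and efficiency.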

ABSTRACT

Searching techniques in most of existing neural architecture search (NAS) algorithms are mainly dominated by differentiable methods for the efficiency reason. In contrast, we develop an efficient continuous evolutionary approach for searching neural networks. Architectures in the population that share parameters within one SuperNet in the latest generation will be tuned over the training dataset with a few epochs. The searching in the next evolution generation will directly inherit both the SuperNet and the population, which accelerates the optimal network generation. The non-dominated sorting strategy is further applied to preserve only results on the Pareto front for accurately updating the SuperNet. Several neural networks with different model sizes and performances will be produced after the continuous search with only 0.4 GPU days. As a result, our framework provides a series of networks with the number of parameters ranging from 3.7M to 5.1M under mobile settings. These networks surpass those produced by the state-of-the-art methods on the benchmark ImageNet dataset.

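Motivation and Objectives

Reduce the excessive computational cost of existing evolutionary NAS methods, which train every candidate architecture independently.
Combine evolutionary search with parameter sharing to overcome the limitations of reinforcement-learning-based and differentiable NAS.
Generate diverse, high-performing architectures under a range of model-size and latency constraints.
Develop a continuous evolution strategy that reuses knowledge from earlier generations to speed up convergence.
Enable device-aware NAS by treating latency as one of the objectives in multi-objective optimization.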

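Proposed Method

Initialize a SuperNet composed of multiple cells and blocks, whose parameters are shared by all candidate architectures.
Apply the evolutionary operators (crossover and mutation) to generate new architectures from the current population.
Use non-dominated sorting (pNSGA-III) to select high-quality architectures that trade off accuracy against model size while keeping the Pareto front diverse.
Update the SuperNet using only the selected Pareto-optimal architectures, so that knowledge is retained across generations.
Introduce a protection mechanism that keeps the evolution from collapsing onto suboptimal small architectures.
Run the search as a continuous loop: the SuperNet and the population are carried over from one generation to the next, which minimizes retraining and improves efficiency.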
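
The selection and inheritance steps listed above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: it uses plain non-dominated sorting without the pNSGA-III protection mechanism, and `OP_TABLE`, `toy_evaluate`, and the tuple encoding of an architecture are hypothetical stand-ins for evaluating a candidate with shared SuperNet weights.

```python
import random

# Hypothetical per-operation (accuracy gain, parameter cost); not from the paper.
OP_TABLE = [(0.00, 0), (0.02, 1), (0.03, 3), (0.01, 2)]

def dominates(a, b):
    """True if score a = (accuracy, size) is no worse than b on both
    objectives (higher accuracy, smaller size) and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def non_dominated_sort(scores):
    """Peel off successive Pareto fronts; returns a list of index lists."""
    remaining = list(range(len(scores)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(scores[j], scores[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def toy_evaluate(arch):
    """Stand-in for validating an architecture with SuperNet weights."""
    acc = 0.5 + sum(OP_TABLE[op][0] for op in arch)
    size = sum(OP_TABLE[op][1] for op in arch)
    return acc, size

def crossover(p1, p2, rng):
    """Single-point crossover of two architecture encodings."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def mutate(arch, rng):
    """Replace the operation on one randomly chosen edge."""
    child = list(arch)
    child[rng.randrange(len(child))] = rng.randrange(len(OP_TABLE))
    return tuple(child)

def continuous_evolution(generations=5, pop_size=8, num_edges=6, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.randrange(len(OP_TABLE)) for _ in range(num_edges))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # In CARS the shared SuperNet would be trained here for a few epochs
        # on the current population; this sketch skips training entirely.
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.sample(population, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        candidates = population + children
        scores = [toy_evaluate(a) for a in candidates]
        # Fill the next population front by front, best front first.
        ranked = [i for front in non_dominated_sort(scores) for i in front]
        population = [candidates[i] for i in ranked[:pop_size]]
    return population

if __name__ == "__main__":
    for arch in continuous_evolution():
        print(arch, toy_evaluate(arch))
```

The population returned by one generation is the starting point of the next, which mirrors the inheritance idea in the list above; in the real method the SuperNet weights would be carried over in the same way.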

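Experimental Results

Research Questions

RQ1: Compared with existing methods, can a continuous evolutionary NAS framework reduce search cost while maintaining or improving architecture quality?
RQ2: Can parameter sharing through a SuperNet accelerate evolutionary NAS without sacrificing diversity or performance?
RQ3: Can multi-objective optimization with non-dominated sorting (pNSGA-III) produce a diverse Pareto front of architectures that trade off model size against accuracy?
RQ4: How does including device-aware latency in the search affect the performance and deployment efficiency of the final models?
RQ5: Can the proposed method produce transferable architectures that generalize well to ImageNet under mobile constraints?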

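Key Findings

CARS reaches 75.2% top-1 accuracy on ImageNet with its 5.1M-parameter model (CARS-I), surpassing PNAS and DARTS at comparable model sizes; its smallest model has only 3.7M parameters.
The method yields a family of models with 3.7M to 5.1M parameters and 430 to 591 MFLOPs, all obtained from a single search costing 0.4 GPU days.
CARS-I achieves 75.2% top-1 accuracy with 5.1M parameters and 591 MFLOPs, 1 percentage point higher than PNAS at the same parameter count.
CARS-G achieves 74.2% top-1 accuracy with 4.7M parameters, 0.9 percentage points higher than DARTS at the same parameter count.
On a HUAWEI P30 Pro, CARS models have latencies ranging from 82.9 ms to 100.6 ms; CARS-I attains higher accuracy than NASNet and AmoebaNet variants at comparable latency.
The continuous evolution strategy, which reuses the SuperNet, cuts the search cost to 0.4 GPU days, far below the 3150 GPU days of AmoebaNet.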
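
To put the search-cost figures above in perspective, the short calculation below works out the ratio between the two reported costs. It only uses numbers quoted in the findings; the variable names are illustrative.

```python
# Search costs in GPU days, as quoted in the findings.
cars_gpu_days = 0.4
amoebanet_gpu_days = 3150

# How many times cheaper the CARS search is, by these figures.
speedup = amoebanet_gpu_days / cars_gpu_days
print(f"AmoebaNet / CARS search cost: about {speedup:,.0f}x")

# Spread of the reported CARS model family.
param_range_m = 5.1 - 3.7   # millions of parameters
flops_range_m = 591 - 430   # MFLOPs
print(f"Parameter spread: {param_range_m:.1f}M, FLOPs spread: {flops_range_m} MFLOPs")
```

By these figures the CARS search is roughly 7,875 times cheaper than the AmoebaNet search, while the resulting models span 1.4M parameters and 161 MFLOPs.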


This review was generated by AI and reviewed by human editors.