QUICK REVIEW

[Paper Review] Single Path One-Shot Neural Architecture Search with Uniform Sampling

Zichao Guo, Xiangyu Zhang|arXiv (Cornell University)|Mar 31, 2019

Advanced Neural Network Applications|References: 58|Cited by: 246

One-Sentence Summary

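This paper proposes a single-path one-shot NAS method that trains a stochastic supernet with uniform path sampling, enabling efficient and flexible architecture search that supports large search spaces and real-world constraints and achieves state-of-the-art results on ImageNet.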

ABSTRACT

We revisit the one-shot Neural Architecture Search (NAS) paradigm and analyze its advantages over existing NAS approaches. Existing one-shot methods, however, are hard to train and not yet effective on large-scale datasets like ImageNet. This work proposes a Single Path One-Shot model to address the challenge in the training. Our central idea is to construct a simplified supernet, where all architectures are single paths so that the weight co-adaption problem is alleviated. Training is performed by uniform path sampling. All architectures (and their weights) are trained fully and equally. Comprehensive experiments verify that our approach is flexible and effective. It is easy to train and fast to search. It effortlessly supports complex search spaces (e.g., building blocks, channel, mixed-precision quantization) and different search constraints (e.g., FLOPs, latency). It is thus convenient to use for various needs. It achieves state-of-the-art performance on the large dataset ImageNet.

Research Motivation and Objectives

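Revive interest in the one-shot NAS paradigm by addressing its training instability and weight coupling problems.
Propose a single-path supernet trained with uniform path sampling, decoupling architecture search from weight optimization.
Demonstrate a flexible search framework that supports complex design choices (channel counts, mixed-precision quantization) and real-world constraints (FLOPs, latency).
Show state-of-the-art performance on ImageNet in terms of accuracy, memory, and search efficiency.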
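
The decoupling of weight optimization from architecture search can be written as two sequential optimization problems. The notation here is a reconstruction in the usual one-shot NAS style and is an assumption of this sketch, not text taken from the summary: \mathcal{A} is the search space, W the shared supernet weights, \mathcal{N}(a, W(a)) the network for architecture a with weights inherited from the supernet, and \Gamma(\mathcal{A}) a prior over architectures, taken to be uniform.

```latex
% Step 1: train the supernet once, sampling one path per step from the prior
W_{\mathcal{A}} = \operatorname*{arg\,min}_{W} \;
  \mathbb{E}_{a \sim \Gamma(\mathcal{A})}
  \left[ \mathcal{L}_{\text{train}}\big(\mathcal{N}(a, W(a))\big) \right]

% Step 2: search architectures using the inherited weights, without retraining
a^{*} = \operatorname*{arg\,max}_{a \in \mathcal{A}} \;
  \mathrm{ACC}_{\text{val}}\big(\mathcal{N}(a, W_{\mathcal{A}}(a))\big)
```

Because Step 2 only evaluates candidates, it can be repeated under different constraints without repeating Step 1.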

Proposed Method

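Give a formal definition of a single-path supernet in which each architecture corresponds to one path, reducing weight co-adaptation.
Train the supernet with a uniform path sampling strategy so that every architecture is trained fully and equally.
Search architectures with an evolutionary algorithm under strict latency/FLOPs constraints.
Introduce novel choice blocks for channel-number search and mixed-precision quantization search.
Compare uniform path sampling with drop path and show better stability and performance.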
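
To make uniform path sampling concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the "blocks" are scalar weights rather than convolutional layers, and the names `NUM_LAYERS`, `NUM_CHOICES`, `sample_path`, `train_step`, and `train_supernet` are hypothetical. The point it illustrates is that each step activates exactly one candidate per layer, chosen uniformly, and updates only the weights on that path.

```python
import random

NUM_LAYERS = 4    # hypothetical supernet depth
NUM_CHOICES = 3   # hypothetical number of candidate blocks per layer


def sample_path(rng):
    """Draw one architecture: one candidate index per layer, uniformly at random."""
    return [rng.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS)]


def forward(weights, path, x):
    """Toy single-path forward pass: each chosen block adds its scalar weight."""
    for layer, choice in enumerate(path):
        x = x + weights[layer][choice]
    return x


def train_step(weights, path, x, target, lr=0.01):
    """One SGD step on squared error; only weights on the sampled path change."""
    err = forward(weights, path, x) - target
    for layer, choice in enumerate(path):
        weights[layer][choice] -= lr * 2.0 * err  # d(err^2)/dw = 2 * err
    return err * err


def train_supernet(steps, seed=0):
    """Train the toy supernet and record how often each candidate was sampled."""
    rng = random.Random(seed)
    weights = [[0.0] * NUM_CHOICES for _ in range(NUM_LAYERS)]
    counts = [[0] * NUM_CHOICES for _ in range(NUM_LAYERS)]
    for _ in range(steps):
        path = sample_path(rng)
        train_step(weights, path, x=0.0, target=1.0)
        for layer, choice in enumerate(path):
            counts[layer][choice] += 1
    return weights, counts
```

Calling `train_supernet(3000)` returns per-candidate sample counts that come out roughly equal within each layer, which is the sense in which every candidate receives a comparable share of training.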

Figure 1: Comparison of single path strategy and drop path strategy

Experimental Results

Research Questions

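RQ1: Can a single-path, uniformly sampled supernet effectively predict architecture performance without fine-tuning?
RQ2: Does uniform sampling alleviate weight coupling and training instability compared with drop path?
RQ3: How does the method handle complex search spaces (channel counts, quantization) and real-world constraints (latency, FLOPs) on large-scale datasets?
RQ4: Is evolutionary search more effective than random search for selecting architectures in a large space?
RQ5: How do the method's efficiency and memory footprint on ImageNet compare with prior NAS methods?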

Key Findings

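A single-path supernet trained with uniform path sampling is easy to train and fast to search.
The method supports rich search spaces, including channel counts and mixed-precision quantization.
Evolutionary architecture search outperforms random search at finding high-performing architectures under constraints.
On ImageNet, the method achieves high accuracy while satisfying latency/FLOPs constraints, and it needs less memory during training than some prior methods.
The method can run multiple constraint-based searches from the same trained supernet, demonstrating flexibility and efficiency.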
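
A constrained evolutionary search over single paths might look like the sketch below. Everything in it is an illustrative assumption: the `COST` table stands in for FLOPs or latency, `proxy_accuracy` stands in for validation accuracy measured with weights inherited from the trained supernet, and the population size, elite count, and mutation probability are arbitrary values, not the paper's settings. Candidates that exceed the budget are discarded, so every evaluated architecture satisfies the constraint.

```python
import random

NUM_LAYERS = 6            # hypothetical depth
NUM_CHOICES = 4           # hypothetical candidates per layer
COST = [1, 2, 3, 4]       # hypothetical per-choice cost (stand-in for FLOPs)


def cost(arch):
    """Total cost of an architecture under the toy cost table."""
    return sum(COST[c] for c in arch)


def proxy_accuracy(arch):
    """Hypothetical fitness; a real run would evaluate inherited supernet weights."""
    return sum(c ** 0.5 for c in arch)


def random_arch(rng, budget):
    """Rejection-sample a random architecture that fits the budget."""
    while True:
        arch = tuple(rng.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS))
        if cost(arch) <= budget:
            return arch


def mutate(arch, rng, prob=0.1):
    """Resample each layer's choice independently with probability `prob`."""
    return tuple(rng.randrange(NUM_CHOICES) if rng.random() < prob else c
                 for c in arch)


def crossover(a, b, rng):
    """Pick each layer's choice from one of the two parents at random."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def evolutionary_search(budget, rng, pop_size=20, top_k=5, generations=10):
    """Return the best feasible architecture found under the cost budget."""
    population = [random_arch(rng, budget) for _ in range(pop_size)]
    for _ in range(generations):
        top = sorted(set(population), key=proxy_accuracy, reverse=True)[:top_k]
        children = list(top)  # keep the elites so the best never gets worse
        while len(children) < pop_size:
            if rng.random() < 0.5:
                child = mutate(rng.choice(top), rng)
            else:
                child = crossover(rng.choice(top), rng.choice(top), rng)
            if cost(child) <= budget:  # drop candidates that break the constraint
                children.append(child)
        population = children
    return max(population, key=proxy_accuracy)
```

A call such as `evolutionary_search(budget=14, rng=random.Random(0))` returns a tuple of per-layer choices whose cost does not exceed 14. Changing `budget` and rerunning reuses the same fitness function, which mirrors running several constrained searches from one trained supernet.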

Figure 2: Evolutionary vs. random architecture search


This review was generated by AI and reviewed by a human editor.