QUICK REVIEW

[Paper Review] One-Shot Neural Architecture Search Through A Posteriori Distribution Guided Sampling.

Yizhou Zhou, Xiaoyan Sun|arXiv (Cornell University)|Jun 23, 2019

Domain Adaptation and Few-Shot Learning|Cited by 3

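One-Sentence Summary

This paper proposes a one-shot neural architecture search (NAS) method that guides sub-network sampling with an estimated joint posterior distribution over architectures and weights, improving both efficiency and accuracy. By modeling this distribution with variational inference and a hybrid network representation, the method cuts the number of sampled sub-networks by orders of magnitude and, across CIFAR-10, CIFAR-100 and ImageNet, strikes the best trade-off between accuracy and search speed among NAS methods; on CIFAR-10 the search runs 20x faster while reaching higher accuracy than the best network found by existing NAS methods.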

ABSTRACT

The emergence of one-shot approaches has greatly advanced the research on neural architecture search (NAS). Recent approaches train an over-parameterized super-network (one-shot model) and then sample and evaluate a number of sub-networks, which inherit weights from the one-shot model. The overall searching cost is significantly reduced as training is avoided for sub-networks. However, the network sampling process is casually treated and the inherited weights from an independently trained super-network perform sub-optimally for sub-networks. In this paper, we propose a novel one-shot NAS scheme to address the above issues. The key innovation is to explicitly estimate the joint a posteriori distribution over network architecture and weights, and sample networks for evaluation according to it. This brings two benefits. First, network sampling under the guidance of a posteriori probability is more efficient than conventional random or uniform sampling. Second, the network architecture and its weights are sampled as a pair to alleviate the sub-optimal weights problem. Note that estimating the joint a posteriori distribution is not a trivial problem. By adopting variational methods and introducing a hybrid network representation, we convert the distribution approximation problem into an end-to-end neural network training problem which is neatly approached by variational dropout. As a result, the proposed method reduces the number of sampled sub-networks by orders of magnitude. We validate our method on the fundamental image classification task. Results on Cifar-10, Cifar-100 and ImageNet show that our method strikes the best trade-off between precision and speed among NAS methods. On Cifar-10, we speed up the searching process by 20x and achieve a higher precision than the best network found by existing NAS methods.

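Motivation and Objectives

Address the inefficiency and sub-optimal results caused by random or uniform sub-network sampling in conventional one-shot NAS.
Alleviate the problem that weights inherited from the super-network are sub-optimal for individual sub-networks.
Develop a method that samples architectures and weights jointly from a learned posterior distribution, improving both search efficiency and accuracy.
Reduce the number of sub-network evaluations that NAS requires while maintaining or improving the performance of the final model.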

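Proposed Method

The method uses variational inference to estimate the joint posterior distribution over network architectures and weights.
A hybrid network representation is introduced so that this joint distribution can be parameterized effectively.
The distribution approximation problem is recast as an end-to-end neural network training problem, which is solved with variational dropout.
Sub-networks are sampled according to the estimated posterior probability, so that each architecture is drawn together with its weights as a pair.
The whole procedure is trained as a single end-to-end network, which keeps the search efficient and amenable to gradient-based training.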
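To make the idea of paired sampling concrete, here is a minimal sketch in Python. It is not the paper's parameterization: it assumes a toy factorized posterior in which each candidate operation has a Bernoulli keep-probability and a single Gaussian-distributed scalar weight. The names `sample_pair`, `keep_prob`, `weight_mean` and `weight_std` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_pair(keep_prob, weight_mean, weight_std, rng):
    """Draw one (architecture, weights) pair from a toy approximate posterior.

    keep_prob:   (num_edges, num_ops) hypothetical probability of keeping each candidate op
    weight_mean: (num_edges, num_ops) hypothetical posterior mean, one scalar weight per op
    weight_std:  (num_edges, num_ops) hypothetical posterior standard deviation
    """
    # Architecture: a binary mask, one Bernoulli draw per candidate op.
    mask = rng.random(keep_prob.shape) < keep_prob
    # Weights: drawn in the same step, and only meaningful for the ops that were kept.
    weights = rng.normal(weight_mean, weight_std)
    return mask, np.where(mask, weights, 0.0)

# Usage: two edges with three candidate ops each (toy numbers, not from the paper).
rng = np.random.default_rng(0)
keep_prob = np.array([[1.0, 0.0, 0.5],
                      [1.0, 0.0, 0.5]])
weight_mean = np.ones((2, 3))
weight_std = np.full((2, 3), 0.1)
mask, weights = sample_pair(keep_prob, weight_mean, weight_std, rng)
```

Because the mask and the weights come out of one call, every sampled architecture arrives with weights drawn for it, rather than with weights copied from an independently trained super-network.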

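Experimental Results

Research Questions

RQ1: Can a posterior distribution over architectures and weights improve sampling efficiency in one-shot NAS?
RQ2: How does posterior-guided sampling compare with random or uniform sampling in search efficiency and accuracy?
RQ3: Can jointly sampling architectures and weights alleviate the sub-optimality of weights inherited from the super-network?
RQ4: How far can the number of sub-network evaluations be reduced while maintaining or improving search performance?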

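Key Findings

Compared with conventional one-shot NAS, the proposed method reduces the number of sampled sub-networks by orders of magnitude.
On CIFAR-10, the method reaches higher accuracy than the best network found by existing NAS methods.
On CIFAR-10, the search process is 20x faster while still delivering that higher accuracy.
On CIFAR-10, CIFAR-100 and ImageNet, the method strikes the best trade-off between accuracy and search speed among NAS methods.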
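A back-of-the-envelope calculation suggests why guided sampling can save orders of magnitude in sample count. The sketch below assumes a purely hypothetical search space (14 edges, 8 candidate operations per edge, independent per-edge choices) and a hypothetical posterior that puts probability 0.9 on the right operation at every edge. None of these numbers come from the paper.

```python
def expected_samples(per_edge_hit_prob: float, num_edges: int) -> float:
    """Expected number of independent draws until one specific architecture appears.

    With independent edges, a single draw hits the target architecture with
    probability per_edge_hit_prob ** num_edges; the mean of the resulting
    geometric distribution is the reciprocal of that probability.
    """
    return 1.0 / (per_edge_hit_prob ** num_edges)

num_edges, num_ops = 14, 8  # hypothetical search space, not taken from the paper

uniform = expected_samples(1.0 / num_ops, num_edges)  # uniform sampling: 8**14 draws on average
guided = expected_samples(0.9, num_edges)             # concentrated posterior: a handful of draws

print(f"uniform: {uniform:.3g} draws, guided: {guided:.3g} draws")
```

The gap in this toy setting depends entirely on how sharply the posterior concentrates, so it illustrates the mechanism only and says nothing about the figures reported in the paper.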


This review was generated by AI and checked by a human editor.