QUICK REVIEW

[Paper Review] One-Shot Neural Architecture Search Through A Posteriori Distribution Guided Sampling

Yizhou Zhou, Xiaoyan Sun|arXiv (Cornell University)|Jun 23, 2019
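Domain Adaptation and Few-Shot Learning | Cited by 3
One-Sentence Summary

This paper proposes a one-shot neural architecture search method that guides sub-network sampling with an estimated joint posterior distribution over architectures and weights, improving both efficiency and accuracy. By modeling this distribution with variational inference and a hybrid network representation, the method reduces the number of sampled sub-networks by orders of magnitude and reports the best accuracy-speed trade-off among NAS methods on CIFAR-10, CIFAR-100 and ImageNet; on CIFAR-10 the search is 20x faster and reaches higher accuracy than the best network found by existing NAS methods.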

ABSTRACT

The emergence of one-shot approaches has greatly advanced the research on neural architecture search (NAS). Recent approaches train an over-parameterized super-network (one-shot model) and then sample and evaluate a number of sub-networks, which inherit weights from the one-shot model. The overall searching cost is significantly reduced as training is avoided for sub-networks. However, the network sampling process is casually treated and the inherited weights from an independently trained super-network perform sub-optimally for sub-networks. In this paper, we propose a novel one-shot NAS scheme to address the above issues. The key innovation is to explicitly estimate the joint a posteriori distribution over network architecture and weights, and sample networks for evaluation according to it. This brings two benefits. First, network sampling under the guidance of a posteriori probability is more efficient than conventional random or uniform sampling. Second, the network architecture and its weights are sampled as a pair to alleviate the sub-optimal weights problem. Note that estimating the joint a posteriori distribution is not a trivial problem. By adopting variational methods and introducing a hybrid network representation, we convert the distribution approximation problem into an end-to-end neural network training problem which is neatly approached by variational dropout. As a result, the proposed method reduces the number of sampled sub-networks by orders of magnitude. We validate our method on the fundamental image classification task. Results on Cifar-10, Cifar-100 and ImageNet show that our method strikes the best trade-off between precision and speed among NAS methods. On Cifar-10, we speed up the searching process by 20x and achieve a higher precision than the best network found by existing NAS methods.

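Motivation and Objectives

  • Address the inefficiency and sub-optimal results caused by random or uniform sub-network sampling in conventional one-shot NAS.
  • Mitigate the problem that weights inherited from the super-network are sub-optimal for individual sub-networks.
  • Develop a method that samples architectures and weights jointly from a learned posterior distribution, improving search efficiency and accuracy.
  • Reduce the number of sub-network evaluations needed during the search while maintaining or improving final model performance.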

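Proposed Method

  • The method uses variational inference to estimate the joint posterior distribution over network architectures and weights.
  • A hybrid network representation is introduced so that this joint distribution can be parameterized effectively.
  • The distribution approximation problem is recast as an end-to-end neural network training problem, which is solved with variational dropout.
  • Sub-networks are sampled according to the estimated posterior, so that each architecture and its weights are drawn as a pair during the search.
  • The whole pipeline is trained in a single end-to-end fashion, which keeps the search efficient and differentiable.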
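
The sampling step described above can be illustrated with a small sketch. It is not the paper's implementation: the two-edge search space, the keep probabilities, the scalar "weights", the Gaussian noise level and every function name below are assumptions made up for illustration. It only shows the general idea of drawing an architecture together with its weights from a learned distribution, next to a uniform baseline.

```python
import random

# Hypothetical per-edge posterior (illustrative numbers, not from the paper).
# Each candidate op maps to (assumed keep probability, assumed weight mean);
# a single float stands in for a full weight tensor.
POSTERIOR = [
    {"conv3x3": (0.90, 0.8), "conv5x5": (0.10, 0.3), "skip": (0.50, 1.0)},
    {"conv3x3": (0.20, 0.5), "conv5x5": (0.85, 0.7), "skip": (0.05, 1.0)},
]
WEIGHT_NOISE_STD = 0.1  # assumed std of the multiplicative weight noise


def sample_subnetwork(rng, posterior=POSTERIOR):
    """Draw one (architecture, weights) pair from the assumed posterior."""
    subnet = []
    for edge in posterior:
        kept = {}
        for op, (keep_prob, weight_mean) in edge.items():
            if rng.random() < keep_prob:  # architecture decision for this op
                noise = rng.gauss(1.0, WEIGHT_NOISE_STD)
                kept[op] = weight_mean * noise  # weight drawn together with the op
        subnet.append(kept)
    return subnet


def sample_uniform(rng, posterior=POSTERIOR):
    """Baseline: keep every op with probability 0.5 and copy its weight as-is."""
    return [
        {op: w for op, (_, w) in edge.items() if rng.random() < 0.5}
        for edge in posterior
    ]


def keep_frequency(sampler, edge_index, op, trials=2000, seed=0):
    """Fraction of sampled sub-networks that keep `op` on the given edge."""
    rng = random.Random(seed)
    hits = sum(op in sampler(rng)[edge_index] for _ in range(trials))
    return hits / trials
```

Calling `keep_frequency(sample_subnetwork, 0, "conv3x3")` should land near the assumed keep probability of 0.90, while `keep_frequency(sample_uniform, 0, "conv3x3")` stays near 0.5. Under these assumptions, guided sampling spends most of its draws on operations the posterior already favors, which is the intuition behind needing fewer sampled sub-networks.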

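Experimental Results

Research Questions

  • RQ1: Can a posterior distribution over architectures and weights improve sampling efficiency in one-shot NAS?
  • RQ2: How does posterior-guided sampling compare with random or uniform sampling in search efficiency and accuracy?
  • RQ3: Can sampling architectures and weights jointly alleviate the sub-optimality of weights inherited from the super-network?
  • RQ4: How far can the number of sub-network evaluations be reduced while maintaining or improving search performance?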

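Key Findings

  • Compared with conventional one-shot NAS, the proposed method reduces the number of sampled sub-networks by orders of magnitude.
  • On CIFAR-10, the method achieves higher top-1 accuracy than the best network found by existing NAS methods.
  • On CIFAR-10, the search process is 20x faster while still delivering that higher accuracy.
  • On CIFAR-10, CIFAR-100 and ImageNet, the method reports the best trade-off between search speed and accuracy among NAS methods.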

This review was generated by AI and checked by a human editor.