QUICK REVIEW

[Paper Review] Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS

Shi Han, Renjie Pi|arXiv (Cornell University)|Nov 21, 2019
Machine Learning and Data Classification|References: 51|Cited by: 53
One-Sentence Summary

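TL;DR: BONAS combines Graph Convolutional Network embeddings with Bayesian sigmoid regression to guide a Bayesian-optimization-based search, and uses weight-sharing to efficiently evaluate a batch of promising architectures, improving both the reliability and the speed of sample-based NAS.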

ABSTRACT

Neural Architecture Search (NAS) has shown great potential in finding better neural network designs. Sample-based NAS is the most reliable approach, which aims at exploring the search space and evaluating the most promising architectures. However, it is computationally very costly. As a remedy, the one-shot approach has emerged as a popular technique for accelerating NAS using weight-sharing. However, due to the weight-sharing of vastly different networks, the one-shot approach is less reliable than the sample-based approach. In this work, we propose BONAS (Bayesian Optimized Neural Architecture Search), a sample-based NAS framework which is accelerated using weight-sharing to evaluate multiple related architectures simultaneously. Specifically, we apply a Graph Convolutional Network predictor as a surrogate model for Bayesian Optimization to select multiple related candidate models in each iteration. We then apply weight-sharing to train multiple candidate models simultaneously. This approach not only accelerates the traditional sample-based approach significantly, but also keeps its reliability. This is because weight-sharing among related architectures is more reliable than weight-sharing in the one-shot approach. Extensive experiments are conducted to verify the effectiveness of our method over many competing algorithms.

Motivation and Objectives

  • Make sample-based neural architecture search (NAS) more efficient while keeping its reliability.
  • Develop a surrogate model that naturally handles graph-structured architectures without handcrafted kernels.
  • Accelerate evaluation by weight-sharing a small subset of high-potential architectures.
  • Demonstrate BONAS gains across closed-domain NAS benchmarks and open-domain search spaces.
  • Show transferability and robustness of BONAS across architectures and datasets.

Proposed Method

  • Encode neural architectures as graphs and derive global graph embeddings via a Graph Convolutional Network (GCN).
  • Replace Gaussian process surrogates with a Bayesian sigmoid regression (BSR) over GCN embeddings to obtain predictive mean and variance for Bayesian optimization (BO).
  • Use an exponentially weighted loss to train the surrogate, emphasizing high-accuracy architectures.
  • In the query phase, form a small super-network by weight-sharing a batch of top-k BO-selected architectures and train them together, reinitializing weights to ensure fair evaluation.
  • Select candidates from a pool using UCB acquisition with mean/variance provided by the GCN+BSR surrogate.
  • Iteratively update the surrogate with newly evaluated architectures and refine embeddings.
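
The candidate-selection and surrogate-training steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `beta` exploration weight, and the exact form of the exponential weighting (normalized so that an accuracy of 1.0 gets weight 1) are assumptions made for the example. The surrogate's predictive means and variances are taken as given inputs rather than computed from a GCN.

```python
import math

def ucb_scores(means, variances, beta=0.5):
    """Upper confidence bound: predictive mean plus beta times the std. dev.
    `beta` is a hypothetical exploration weight chosen for illustration."""
    return [m + beta * math.sqrt(v) for m, v in zip(means, variances)]

def select_top_k(pool, means, variances, k, beta=0.5):
    """Return the k candidates from `pool` with the highest UCB score."""
    if not 0 < k <= len(pool):
        raise ValueError("k must be between 1 and the pool size")
    scores = ucb_scores(means, variances, beta)
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in ranked[:k]]

def exp_weighted_mse(predicted, observed):
    """Squared error weighted by (exp(y) - 1) / (e - 1), so architectures
    with higher observed accuracy y contribute more to the loss.
    The weighting formula is an assumed form, not taken from the summary."""
    total = 0.0
    for y_hat, y in zip(predicted, observed):
        weight = (math.exp(y) - 1.0) / (math.e - 1.0)
        total += weight * (y_hat - y) ** 2
    return total / len(observed)

# Toy pool of four candidate architectures with surrogate predictions.
pool = ["arch_a", "arch_b", "arch_c", "arch_d"]
means = [0.90, 0.92, 0.85, 0.91]
variances = [0.0001, 0.0004, 0.01, 0.0001]
batch = select_top_k(pool, means, variances, k=2)
print(batch)
```

In this toy pool, `arch_c` has the largest uncertainty but its low predicted mean keeps it out of the batch. A larger `beta` would favor exploration and could change that. The selected batch is what would then be trained together with shared weights in the query phase.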

Experimental Results

Research Questions

  • RQ1: Can a graph-based embedding plus Bayesian surrogate improve BO-based NAS performance without handcrafted kernels?
  • RQ2: Does weight-sharing over a small, high-potential subset of architectures yield reliable and faster evaluations than full training or large-scale weight-sharing?
  • RQ3: How does BONAS perform on standard NAS benchmarks (NAS-Bench-101, NAS-Bench-201) and open-domain search spaces (e.g., NASNet) compared to state-of-the-art methods?
  • RQ4: Is BONAS transferable to other model families (e.g., LSTM cells) and robust across embedding sizes?

Key Findings

  • GCN-based predictors achieve higher correlation with true architecture performance than MLP/LSTM/meta-NN baselines on NAS-Bench-101/201 and LSTM-12K.
  • BONAS consistently outperforms competing baselines in closed-domain NAS benchmarks.
  • In open-domain NAS (NASNet search space), BONAS achieves competitive top-1 error on CIFAR-10 while requiring substantially fewer GPU days than some baselines.
  • BONAS enables exploring and evaluating thousands of architectures efficiently via small-batch weight-sharing (k around 100) in the super-network phase.
  • Transferring BONAS-discovered architectures from CIFAR-10 to ImageNet yields competitive results, with BONAS-derived cells achieving strong top-1/top-5 metrics under mobile constraints.
  • Ablations show that the GCN+BSR surrogate with the weighted loss and the weight-sharing query phase both benefit performance and efficiency.
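
The first finding compares predictors by how well their outputs correlate with true architecture performance. The summary does not say which correlation measure was used, so the sketch below uses Kendall's tau (the tau-a variant) purely as an illustration of how such a comparison could be computed. The function name and the toy accuracy values are hypothetical.

```python
from itertools import combinations

def kendall_tau(predicted, actual):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / all pairs.
    Ties count as neither concordant nor discordant."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(predicted)), 2):
        sign = (predicted[i] - predicted[j]) * (actual[i] - actual[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(predicted) * (len(predicted) - 1) // 2
    return (concordant - discordant) / n_pairs

# Toy example: predicted vs. measured accuracy for four architectures.
predicted_acc = [0.90, 0.80, 0.70, 0.95]
measured_acc = [0.91, 0.85, 0.80, 0.90]
print(round(kendall_tau(predicted_acc, measured_acc), 3))
```

A value near 1 means the predictor ranks architectures almost exactly as their measured accuracy does, which is what matters when the predictor is used to pick the next batch to evaluate.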


This review was generated by AI and reviewed by human editors.