QUICK REVIEW

[Paper Review] Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search

Arber Zela, Aaron Klein|arXiv (Cornell University)|Jul 18, 2018
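Machine Learning and Data Classification | References: 21 | Cited by: 88
One-Sentence Summary

This paper proposes using BOHB to jointly optimize neural architectures and hyperparameters while successively increasing the training budget. It reports competitive CIFAR-10 results within a 3-hour limit and reveals budget-dependent interactions between architecture and hyperparameters.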

ABSTRACT

While existing work on neural architecture search (NAS) tunes hyperparameters in a separate post-processing step, we demonstrate that architectural choices and other hyperparameter settings interact in a way that can render this separation suboptimal. Likewise, we demonstrate that the common practice of using very few epochs during the main NAS and much larger numbers of epochs during a post-processing step is inefficient due to little correlation in the relative rankings for these two training regimes. To combat both of these problems, we propose to use a recent combination of Bayesian optimization and Hyperband for efficient joint neural architecture and hyperparameter search.

Motivation and Objectives

  • Motivate joint optimization of architecture and hyperparameters rather than post-hoc tuning.
  • Show that short training budgets may not correlate well with long-budget performance.
  • Demonstrate an anytime, budget-aware AutoML approach that gradually increases resources.
  • Evaluate joint NAS-HP search on CIFAR-10 under a 3-hour constraint.

Proposed Method

  • Cast neural architecture search as a hyperparameter optimization problem with categorical and conditional hyperparameters.
  • Adopt BOHB, a combination of Bayesian optimization and Hyperband, for efficient multi-budget search.
  • Define a joint search space with 10 architectural choices and 7 hyperparameters for a multi-branch residual architecture.
  • Use successive halving to allocate more compute to promising configurations across budgets.
  • Train and evaluate configurations under multiple budgets (e.g., 400s, 1200s, 1h, 3h) to capture budget-aware performance.
  • Compare to manually constructed architectures and analyze budget correlations and parameter importance.
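The successive-halving step in the list above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two-field search space (`num_branches`, `learning_rate`), the `toy_error` function, and the choice of 27 starting candidates are all hypothetical stand-ins, and candidates are drawn at random rather than from BOHB's Bayesian optimization model. Only the four budgets come from the review; they happen to form a geometric sequence with a factor of 3, which is used here as the reduction factor `eta`.

```python
import math
import random

# Budgets named in the review: 400 s, 1200 s, 1 h, 3 h (each 3x the previous).
BUDGETS_S = [400, 1200, 3600, 10800]


def sample_config(rng):
    """Hypothetical joint configuration: one architectural choice and one
    training hyperparameter. The paper's real space is larger."""
    return {
        "num_branches": rng.choice([1, 2, 3]),
        "learning_rate": 10 ** rng.uniform(-3, -1),
    }


def toy_error(config, budget_s):
    """Stand-in for 'train for budget_s seconds, return validation error'.
    Purely synthetic; in this toy the ranking does not change with budget,
    whereas the review notes that real rankings can."""
    lr_penalty = abs(math.log10(config["learning_rate"]) + 2)  # best near 1e-2
    branch_penalty = 0.5 * abs(config["num_branches"] - 2)     # best at 2
    return lr_penalty + branch_penalty + 100.0 / budget_s


def successive_halving(configs, evaluate, budgets, eta=3):
    """Evaluate all survivors at each budget and keep the best 1/eta of them
    for the next, larger budget. Returns the final best config and the number
    of configs evaluated at each budget."""
    survivors = list(configs)
    rung_sizes = []
    for i, budget in enumerate(budgets):
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        rung_sizes.append(len(ranked))
        if i < len(budgets) - 1:
            survivors = ranked[: max(1, len(ranked) // eta)]
        else:
            survivors = ranked
    return survivors[0], rung_sizes


rng = random.Random(0)
candidates = [sample_config(rng) for _ in range(27)]
best, rung_sizes = successive_halving(candidates, toy_error, BUDGETS_S)
print(rung_sizes)  # [27, 9, 3, 1]
```

With 27 candidates and `eta=3`, only one configuration is ever trained for the full 3 hours, which is where the compute savings come from. The cost is that a configuration that only becomes good at large budgets can be discarded early.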

Experimental Results

Research Questions

  • RQ1: Can neural architecture search be effectively performed jointly with hyperparameter optimization?
  • RQ2: How do short and long training budgets correlate in ranking configurations, and what budget should be used during optimization?
  • RQ3: Is the BOHB approach effective under a strict time budget for CIFAR-10?
  • RQ4: What architectural and hyperparameter choices are most influential under limited compute budgets?

Key Findings

Network                  Params   Test error (%)
ResNet-18                11.2M    3.34±0.11
Shake-Shake 26 2x32d     2.9M     3.91±0.09
Shake-Shake 26 2x64d     11.7M    3.38±0.07
Shake-Shake 26 2x96d     26.2M    4.22±0.06
Ours                     27.6M    3.18±0.16

  • Joint architecture and hyperparameter search with BOHB yields competitive CIFAR-10 results within a 3-hour budget (test error 3.18%).
  • The best performing architecture under 3h is a medium-sized multi-branch residual network (26 2x64d).
  • Spearman correlations show strong alignment between adjacent budgets but degrade quickly across larger budget gaps, making short-budget rankings unreliable for long-budget selection.
  • Budget-aware analysis (fANOVA) indicates different hyperparameters and architectural choices gain or lose importance as the budget changes, highlighting interaction effects.
  • BOHB-based search outperformed several standard architectures under the same optimization pipeline and budget, demonstrating the value of joint optimization.
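The budget-correlation finding above relies on Spearman rank correlation, which compares how two budgets order the same set of configurations. The sketch below shows the computation on made-up numbers: the three error lists are hypothetical and are not results from the paper. They are chosen only so that adjacent budgets agree more closely than distant ones.

```python
def ranks(values):
    """Rank positions (1 = smallest). Assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)).
    This closed form is only valid when neither list contains ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical validation errors (%) of the same five configurations
# at three training budgets. Illustrative values only.
errors_400s = [10.0, 11.0, 12.0, 13.0, 14.0]
errors_1200s = [7.0, 6.5, 8.0, 9.0, 8.5]
errors_3600s = [5.0, 4.2, 4.6, 6.0, 5.5]

print(round(spearman(errors_400s, errors_1200s), 2))   # 0.8
print(round(spearman(errors_1200s, errors_3600s), 2))  # 0.9
print(round(spearman(errors_400s, errors_3600s), 2))   # 0.6
```

A value near 1 means the cheaper budget ranks configurations almost the same way as the more expensive one, so selecting on the cheap budget is safe. The lower value for the 400 s versus 3600 s pair in this toy data mirrors the pattern the review describes.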


This review was generated by AI and checked by a human editor.