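[Paper Review] Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search
This paper proposes using BOHB to jointly optimize neural architectures and hyperparameters while successively increasing the training budget. It reports competitive CIFAR-10 results within a 3-hour limit and reveals budget-dependent interactions between architecture and hyperparameters.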
While existing work on neural architecture search (NAS) tunes hyperparameters in a separate post-processing step, we demonstrate that architectural choices and other hyperparameter settings interact in a way that can render this separation suboptimal. Likewise, we demonstrate that the common practice of using very few epochs during the main NAS and much larger numbers of epochs during a post-processing step is inefficient due to little correlation in the relative rankings for these two training regimes. To combat both of these problems, we propose to use a recent combination of Bayesian optimization and Hyperband for efficient joint neural architecture and hyperparameter search.
Research Motivation and Objectives
- Motivate joint optimization of architecture and hyperparameters rather than post-hoc tuning.
- Show that short training budgets may not correlate well with long-budget performance.
- Demonstrate an anytime, budget-aware AutoML approach that gradually increases resources.
- Evaluate joint NAS-HP search on CIFAR-10 under a 3-hour constraint.
Proposed Method
- Cast neural architecture search as a hyperparameter optimization problem with categorical and conditional hyperparameters.
- Adopt BOHB, a combination of Bayesian optimization and Hyperband, for efficient multi-budget search.
- Define a joint search space with 10 architectural choices and 7 hyperparameters for a multi-branch residual architecture.
- Use successive halving to allocate more compute to promising configurations across budgets.
- Train and evaluate configurations under multiple budgets (e.g., 400s, 1200s, 1h, 3h) to capture budget-aware performance.
- Compare to manually constructed architectures and analyze budget correlations and parameter importance.
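
The budget ladder and the successive-halving step described above can be illustrated with a minimal sketch. It assumes a reduction factor of 3 (which reproduces the 400s, 1200s, 1h, 3h budgets listed above), a tiny hypothetical search space (`num_branches`, `learning_rate`, `optimizer`, and a conditional `momentum`), and a made-up `toy_validation_error` function in place of real training. It also samples configurations uniformly at random, whereas BOHB replaces that step with a model-based sampler, so this is not the paper's implementation.

```python
import math
import random


def budget_schedule(min_budget, max_budget, eta):
    """Geometric budget ladder: min_budget, min_budget * eta, ... up to max_budget."""
    budgets = []
    budget = min_budget
    while budget <= max_budget:
        budgets.append(budget)
        budget *= eta
    return budgets


def sample_config(rng):
    """Draw one hypothetical joint configuration (architecture + training settings)."""
    config = {
        "num_branches": rng.choice([1, 2, 3]),      # architectural choice (categorical)
        "learning_rate": 10 ** rng.uniform(-3, 0),  # hyperparameter (log-uniform)
        "optimizer": rng.choice(["sgd", "adam"]),   # hyperparameter (categorical)
    }
    # Conditional hyperparameter: momentum is only defined when SGD is selected.
    if config["optimizer"] == "sgd":
        config["momentum"] = rng.uniform(0.5, 0.99)
    return config


def toy_validation_error(config, budget_seconds):
    """Made-up stand-in for training a network; not a real measurement."""
    lr_penalty = abs(math.log10(config["learning_rate"]) + 1.5)
    # Larger networks only pay off once the budget is long enough.
    capacity_bonus = 0.5 * config["num_branches"] * min(1.0, budget_seconds / 3600)
    return 10.0 + lr_penalty - capacity_bonus + 100.0 / budget_seconds


def successive_halving(configs, budgets, eta, evaluate):
    """Evaluate all survivors at each budget and keep the best 1/eta of them."""
    survivors = list(configs)
    history = []  # (budget, number of configurations evaluated at that budget)
    for budget in budgets:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        history.append((budget, len(ranked)))
        survivors = ranked[: max(1, len(ranked) // eta)]
    return survivors[0], history


if __name__ == "__main__":
    rng = random.Random(0)
    budgets = budget_schedule(400, 10800, 3)
    candidates = [sample_config(rng) for _ in range(27)]
    best, history = successive_halving(candidates, budgets, 3, toy_validation_error)
    print("budgets (s):", budgets)
    print("evaluations per budget:", history)
    print("selected configuration:", best)
```

Starting with 27 candidates and a reduction factor of 3, only one configuration is trained for the full 3 hours, which is how most of the compute ends up on the promising configurations.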
Experimental Results
Research Questions
- RQ1: Can neural architecture search be effectively performed jointly with hyperparameter optimization?
- RQ2: How do short and long training budgets correlate in ranking configurations, and what budget should be used during optimization?
- RQ3: Is the BOHB approach effective under a strict time budget for CIFAR-10?
- RQ4: What architectural and hyperparameter choices are most influential under limited compute budgets?
Key Findings
| Network | Params | Test error (%) |
|---|---|---|
| ResNet-18 | 11.2M | 3.34±0.11 |
| Shake-Shake 26 2x32d | 2.9M | 3.91±0.09 |
| Shake-Shake 26 2x64d | 11.7M | 3.38±0.07 |
| Shake-Shake 26 2x96d | 26.2M | 4.22±0.06 |
| Ours | 27.6M | 3.18±0.16 |
- Joint architecture and hyperparameter search with BOHB yields competitive CIFAR-10 results within a 3-hour budget (test error 3.18%).
- Among the Shake-Shake baselines trained for 3 hours, the medium-sized multi-branch residual network (26 2x64d) performs best, ahead of both the smaller 2x32d and the larger 2x96d variants.
- Spearman correlations show strong alignment between adjacent budgets but degrade quickly across larger budget gaps, making short-budget rankings unreliable for long-budget selection.
- Budget-aware analysis (fANOVA) indicates different hyperparameters and architectural choices gain or lose importance as the budget changes, highlighting interaction effects.
- The configuration found by BOHB (3.18% test error) outperformed ResNet-18 and all three Shake-Shake variants trained with the same optimization pipeline and 3-hour budget, demonstrating the value of joint optimization.
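
The budget-correlation finding above rests on Spearman rank correlation between the errors that the same configurations reach at different budgets. The following is a minimal sketch of that computation in plain Python. The error values in `illustrative_errors` are invented purely to show the mechanics and are not taken from the paper; the helper names are likewise hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented validation errors (%) for five configurations at each budget (seconds).
illustrative_errors = {
    400: [12.0, 10.5, 11.2, 13.1, 10.9],
    1200: [8.1, 7.9, 7.4, 9.0, 7.6],
    3600: [5.2, 5.9, 4.8, 5.5, 5.0],
    10800: [3.6, 4.4, 3.9, 3.4, 3.8],
}

if __name__ == "__main__":
    budgets = sorted(illustrative_errors)
    for low in budgets:
        for high in budgets:
            if low < high:
                rho = spearman(illustrative_errors[low], illustrative_errors[high])
                print(f"{low:>5}s vs {high:>5}s: rho = {rho:+.2f}")
```

A correlation near 1 between two budgets means the cheaper budget ranks configurations almost the same way as the expensive one; a value near 0 means the short run says little about which configuration wins after long training.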
This review was generated by AI and checked by a human editor.