QUICK REVIEW

[Paper Review] Combination of Hyperband and Bayesian Optimization for Hyperparameter Optimization in Deep Learning

Jiazhuo Wang, Jason Xu|arXiv (Cornell University)|Jan 5, 2018

Machine Learning and Data Classification | References: 5 | Citations: 57

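One-Line Summary

This paper proposes combining Hyperband with Bayesian optimization so that hyperparameter search can exploit history from earlier trials, finding better hyperparameter configurations for deep learning faster than Hyperband, Bayesian optimization, or random search alone.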

ABSTRACT

Deep learning has achieved impressive results on many problems. However, it requires high degree of expertise or a lot of experience to tune well the hyperparameters, and such manual tuning process is likely to be biased. Moreover, it is not practical to try out as many different hyperparameter configurations in deep learning as in other machine learning scenarios, because evaluating each single hyperparameter configuration in deep learning would mean training a deep neural network, which usually takes quite long time. Hyperband algorithm achieves state-of-the-art performance on various hyperparameter optimization problems in the field of deep learning. However, Hyperband algorithm does not utilize history information of previous explored hyperparameter configurations, thus the solution found is suboptimal. We propose to combine Hyperband algorithm with Bayesian optimization (which does not ignore history when sampling next trial configuration). Experimental results show that our combination approach is superior to other hyperparameter optimization approaches including Hyperband algorithm.

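Motivation and Objectives

Motivates the need for systematic hyperparameter tuning in deep learning by its high complexity and training cost.
Proposes a method that allocates resources efficiently while using history information to guide hyperparameter sampling.
Shows that the combined method outperforms existing hyperparameter optimization methods across a range of DL tasks.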

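Proposed Method

Reviews Hyperband and Bayesian optimization and summarizes the strengths and weaknesses of each.
Proposes an algorithm that follows Hyperband but samples trial points sequentially, using a Bayesian optimization criterion for the sampling.
Uses a Bayesian surrogate model (TPE) to guide the choice of the next trial point, updating it with intermediate results.
In each Hyperband round, samples trial points one at a time and updates the surrogate model after every evaluation, balancing exploitation and exploration.
Evaluates the method on LeNet and AlexNet experiments and on an SSD decomposition task, comparing against Random search, TPE, and Hyperband.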
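
The method outlined above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The bracket schedule follows the standard Hyperband formulation, but everything else is an assumption made for illustration: the search space is a single hyperparameter, `tpe_like_sample` is a simplified TPE-style rule (split the history into good and bad points, then pick the candidate with the highest density ratio), observations from all resource levels are pooled into one history, and `toy_objective` is a hypothetical stand-in for training a network.

```python
import math
import random


def hyperband_brackets(max_resource=81, eta=3):
    """Standard Hyperband schedule: per bracket, a list of (n_configs, resource) rungs."""
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1
    brackets = []
    for s in range(s_max, -1, -1):
        n = -(-(s_max + 1) * eta ** s // (s + 1))  # integer ceiling division
        brackets.append([(n // eta ** i, max_resource / eta ** (s - i))
                         for i in range(s + 1)])
    return brackets


def _kde_logpdf(x, centers, bandwidth):
    """Log-density of an equal-weight Gaussian mixture centred on `centers`."""
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    norm = len(centers) * bandwidth * math.sqrt(2 * math.pi)
    return math.log(total / norm + 1e-12)


def tpe_like_sample(history, low, high, rng, gamma=0.25, n_startup=5, n_candidates=24):
    """Simplified TPE-style proposal (an assumption of this sketch, not the paper's code)."""
    if len(history) < max(2, n_startup):
        return rng.uniform(low, high)  # too little history: fall back to random search
    ranked = sorted(history, key=lambda pair: pair[1])
    n_good = max(1, int(gamma * len(ranked)))
    good = [x for x, _ in ranked[:n_good]]   # models l(x): low-loss configurations
    bad = [x for x, _ in ranked[n_good:]]    # models g(x): the remaining configurations
    bandwidth = (high - low) / 10
    candidates = [min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                  for _ in range(n_candidates)]
    # Pick the candidate with the largest density ratio l(x) / g(x).
    return max(candidates,
               key=lambda x: _kde_logpdf(x, good, bandwidth) - _kde_logpdf(x, bad, bandwidth))


def hyperband_tpe(objective, low, high, max_resource=81, eta=3, seed=0):
    """Hyperband whose first-rung configurations are proposed one at a time from history."""
    rng = random.Random(seed)
    history = []  # (x, loss) pairs, pooled across all rungs (simplifying assumption)
    best = (None, math.inf)
    for rungs in hyperband_brackets(max_resource, eta):
        n_first, r_first = rungs[0]
        survivors = []
        for _ in range(n_first):
            # Sequential sampling: each proposal sees every earlier evaluation.
            x = tpe_like_sample(history, low, high, rng)
            loss = objective(x, r_first)
            history.append((x, loss))
            survivors.append((x, loss))
        for n_i, r_i in rungs[1:]:
            # Successive halving: keep the best n_i and re-evaluate with more resource.
            survivors.sort(key=lambda pair: pair[1])
            survivors = [(x, objective(x, r_i)) for x, _ in survivors[:n_i]]
            history.extend(survivors)
        top = min(survivors, key=lambda pair: pair[1])
        if top[1] < best[1]:
            best = top
    return best, len(history)


def toy_objective(log_lr, resource):
    """Hypothetical stand-in for validation loss after `resource` units of training."""
    return (log_lr + 2.0) ** 2 + 1.0 / resource


if __name__ == "__main__":
    (best_x, best_loss), n_evals = hyperband_tpe(toy_objective, low=-5.0, high=0.0)
    print(f"best log_lr={best_x:.3f}, loss={best_loss:.4f}, evaluations={n_evals}")
```

Replacing the call to `tpe_like_sample` with `rng.uniform(low, high)` recovers plain Hyperband, which makes the role of the history-aware sampler easy to isolate. A real setup would swap the toy objective for an actual training run and would need a multi-dimensional sampler.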

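Experimental Results

Research Questions

RQ1: Can Hyperband be improved by incorporating history from previous trials through Bayesian optimization?
RQ2: Does the combined Hyperband + Bayesian optimization method find better hyperparameter configurations faster than the baseline methods across datasets and model complexities?
RQ3: How does the method's performance change as the hyperparameter problem gets harder (deeper networks, larger search spaces)?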

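Key Findings

Hyperband_TPE consistently outperforms Random search, TPE, and Hyperband across multiple DL tasks.
The performance gap between Hyperband_TPE and the baselines widens as the hyperparameter optimization problem gets harder.
On easy problems all methods converge quickly; on hard problems the combined method shows a larger advantage.
In the SSD decomposition experiment, Hyperband_TPE again achieves better objective values (mAP and fps) than the baselines.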


This review was generated by AI and checked by a human editor.