QUICK REVIEW

[Paper Review] Tune: A Research Platform for Distributed Model Selection and Training

Richard Liaw, Eric Liang | arXiv (Cornell University) | Jul 13, 2018

Machine Learning and Data Classification | References: 11 | Citations: 615

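One-Line Summary

Tune provides a unified open-source API and scheduling framework for distributed model selection and training, making it easy to plug a variety of hyperparameter search algorithms in on top of Ray. It decouples training scripts from search logic and scales experiments across a cluster.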

ABSTRACT

Modern machine learning algorithms are increasingly computationally demanding, requiring specialized hardware and distributed computation to achieve high performance in a reasonable time frame. Many hyperparameter search algorithms have been proposed for improving the efficiency of model selection, however their adaptation to the distributed compute environment is often ad-hoc. We propose Tune, a unified framework for model selection and training that provides a narrow-waist interface between training scripts and search algorithms. We show that this interface meets the requirements for a broad range of hyperparameter search algorithms, allows straightforward scaling of search to large clusters, and simplifies algorithm implementation. We demonstrate the implementation of several state-of-the-art hyperparameter search algorithms in Tune. Tune is available at http://ray.readthedocs.io/en/latest/tune.html.

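Research Motivation and Objectives

Motivate the need for extensible, reproducible distributed model selection and training.
Introduce Tune as a narrow-waist API between training scripts and hyperparameter search algorithms.
Show that Tune supports a broad range of search strategies and integrates easily across frameworks.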

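Proposed Method

Propose a two-part design: a user API for training scripts and a scheduling API for search algorithms.
Let training code interact with Tune during training through either a cooperative control model or direct trial control.
Provide a TrialScheduler interface for managing concurrent trials, exposing on_result and choose_trial_to_run.
Build on the Ray framework to handle distributed execution, resource management, and data handling between trials.
Implement or integrate several state-of-the-art hyperparameter search algorithms, including the HyperBand family, the Median Stopping Rule, HyperOpt, and Population-Based Training.
Provide minimal examples and a domain-specific language (DSL) for defining initial trial configurations.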
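
To make the scheduling side of this design concrete, the following is a minimal, self-contained sketch of a scheduler that exposes the two methods named above, on_result and choose_trial_to_run. It does not use Ray or Tune. The class name MedianStoppingScheduler, the Decision enum, the run_trials driver, and the simplified method signatures are all assumptions made for illustration, and the stopping test is a simplified variant of a median stopping rule rather than the paper's implementation.

```python
from collections import deque
from enum import Enum
from statistics import median


class Decision(Enum):
    CONTINUE = "continue"
    STOP = "stop"


class MedianStoppingScheduler:
    """Hypothetical scheduler: stop a trial whose best score so far is below
    the median of the other trials' running-average scores."""

    def __init__(self, grace_period=2):
        self.grace_period = grace_period  # results to collect before judging
        self.history = {}                 # trial_id -> list of reported scores
        self.stopped = set()

    def on_result(self, trial_id, score):
        """Record an intermediate result and decide the trial's fate."""
        scores = self.history.setdefault(trial_id, [])
        scores.append(score)
        if len(scores) < self.grace_period:
            return Decision.CONTINUE
        others = [sum(s) / len(s)
                  for t, s in self.history.items() if t != trial_id]
        if others and max(scores) < median(others):
            self.stopped.add(trial_id)
            return Decision.STOP
        return Decision.CONTINUE

    def choose_trial_to_run(self, pending):
        """Pick the first pending trial that has not been stopped."""
        for trial_id in pending:
            if trial_id not in self.stopped:
                return trial_id
        return None


def run_trials(train_fns, scheduler, max_steps=5):
    """Toy sequential driver; each train function maps a step to a score."""
    queue = deque(train_fns)
    steps = {t: 0 for t in train_fns}
    while queue:
        trial_id = scheduler.choose_trial_to_run(list(queue))
        if trial_id is None:
            break
        queue.remove(trial_id)
        steps[trial_id] += 1
        score = train_fns[trial_id](steps[trial_id])
        decision = scheduler.on_result(trial_id, score)
        if decision is Decision.CONTINUE and steps[trial_id] < max_steps:
            queue.append(trial_id)  # requeue at the back (round-robin)
    return steps


trials = {
    "good": lambda step: 0.2 * step,
    "ok": lambda step: 0.1 * step,
    "bad": lambda step: 0.01 * step,
}
scheduler = MedianStoppingScheduler(grace_period=2)
steps = run_trials(trials, scheduler, max_steps=5)
print(steps)  # {'good': 5, 'ok': 5, 'bad': 2}
```

In this toy run the scheduler sees only intermediate scores and never touches the training code, which is the kind of separation the narrow-waist interface aims for. A real deployment would run trials in parallel on a cluster rather than in a single loop.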

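Experimental Results

Research Questions

RQ1: Can Tune support a broad range of hyperparameter optimization algorithms through a single general API?
RQ2: Does the Ray-based implementation enable scalable, distributed execution of many concurrent trials?
RQ3: Can intermediate trial results be used effectively to drive dynamic scheduling decisions and hyperparameter adaptation?
RQ4: Is the user experience good enough to make integration into existing training scripts easy while preserving reproducibility?
RQ5: How does Tune facilitate reproducibility, visualization, and comparison of AutoML experiments across different schedulers?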

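Key Findings

Tune provides a narrow-waist user API and scheduling API that make it easy to integrate a variety of hyperparameter search algorithms.
The framework supports irregular, heterogeneous trial workloads and scheduling decisions driven by intermediate results.
Several algorithms are implemented or integrated in Tune, including asynchronous HyperBand, HyperBand, the Median Stopping Rule, HyperOpt, and Population-Based Training.
Trials run as Ray tasks/actors, with resource management and data handling done through Ray, which enables nested distributed computation.
Tune provides minimal examples and a DSL for grid/search configurations, and logs progress to the console and through TensorBoard integration.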
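
As an illustration of what a grid-style configuration DSL does, the sketch below expands a specification with grid axes into one concrete configuration per trial. It is a plain-Python approximation written for this review, not Tune's implementation: the helper expand_config and the parameter names lr, momentum, and epochs are hypothetical, and the grid_search marker is only modeled loosely on the idea of tagging a parameter as a grid axis.

```python
from itertools import product


def grid_search(values):
    """Mark a parameter as a grid axis (hypothetical marker format)."""
    return {"grid_search": list(values)}


def expand_config(spec):
    """Expand a spec into one concrete config dict per grid point."""
    grid_keys = [k for k, v in spec.items()
                 if isinstance(v, dict) and "grid_search" in v]
    fixed = {k: v for k, v in spec.items() if k not in grid_keys}
    axes = [spec[k]["grid_search"] for k in grid_keys]
    # The Cartesian product of all axes yields one combination per trial.
    return [dict(fixed, **dict(zip(grid_keys, combo)))
            for combo in product(*axes)]


configs = expand_config({
    "lr": grid_search([0.01, 0.1]),
    "momentum": grid_search([0.8, 0.9]),
    "epochs": 10,  # fixed value shared by every trial
})
for config in configs:
    print(config)
```

Each resulting dictionary would serve as the initial configuration of one trial. A scheduler can then decide which of those trials to run, pause, or stop.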


This review was generated by AI and checked by a human editor.