QUICK REVIEW

[Paper Review] The Theory Behind Overfitting, Cross Validation, Regularization, Bagging, and Boosting: Tutorial

Benyamin Ghojogh, Mark Crowley|arXiv (Cornell University)|May 28, 2019

Domain Adaptation and Few-Shot Learning | Cited by 149

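One-Line Summary

This tutorial explains the theory of overfitting, cross validation, regularization, bagging, and boosting across both regression and classification, using Stein's Unbiased Risk Estimator (SURE) and bias-variance analysis.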

ABSTRACT

In this tutorial paper, we first define mean squared error, variance, covariance, and bias of both random variables and classification/predictor models. Then, we formulate the true and generalization errors of the model for both training and validation/test instances where we make use of the Stein's Unbiased Risk Estimator (SURE). We define overfitting, underfitting, and generalization using the obtained true and generalization errors. We introduce cross validation and two well-known examples which are $K$-fold and leave-one-out cross validations. We briefly introduce generalized cross validation and then move on to regularization where we use the SURE again. We work on both $\ell_2$ and $\ell_1$ norm regularizations. Then, we show that bootstrap aggregating (bagging) reduces the variance of estimation. Boosting, specifically AdaBoost, is introduced and it is explained as both an additive model and a maximum margin model, i.e., Support Vector Machine (SVM). The upper bound on the generalization error of boosting is also provided to show why boosting prevents from overfitting. As examples of regularization, the theory of ridge and lasso regressions, weight decay, noise injection to input/weights, and early stopping are explained. Random forest, dropout, histogram of oriented gradients, and single shot multi-box detector are explained as examples of bagging in machine learning and computer vision. Finally, boosting tree and SVM models are mentioned as examples of boosting.

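Motivation and Objectives

Define mean squared error, variance, covariance, and bias for both random variables and models.
Distinguish the true error from the generalization error using Stein's Unbiased Risk Estimator (SURE).
Introduce cross validation (K-fold and leave-one-out) and generalized cross validation, then discuss regularization.
Explain regularization (ridge and lasso) and its effect on model complexity.
Explain bagging and boosting (including AdaBoost), and relate boosting to SVM and maximum-margin concepts.
Give examples of regularization and ensemble methods in machine learning and computer vision.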
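
As a reference point for the first objective, the standard bias-variance decomposition of the mean squared error can be written as below. The notation ($\hat{\theta}$ for an estimator of a fixed quantity $\theta$) is an assumption made for illustration and may differ from the symbols used in the paper.

$$
\mathrm{MSE}(\hat{\theta}) = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big] = \mathrm{Var}(\hat{\theta}) + \big(\mathrm{Bias}(\hat{\theta})\big)^2, \qquad \mathrm{Bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta
$$

Read this way, regularization typically trades a small increase in bias for a reduction in variance, while bagging targets the variance term directly.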

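Method

Formulate the true and generalization errors for training and validation/test data using SURE.
Derive and relate the bias, variance, and MSE of estimators, including ensemble models.
Present the K-fold and leave-one-out cross validation (LOOCV) procedures, together with definitions of the training/test splits and warnings against cheating.
Introduce generalized cross validation, and analyze regularization with the ℓ2 and ℓ1 norms via SURE.
Explain how bagging reduces variance, and connect boosting to additive models and SVM concepts.
Discuss practical regularization techniques (ridge, lasso, weight decay, early stopping, noise injection) and ensemble methods (random forests, dropout, histogram of oriented gradients, single shot multi-box detector).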
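
The K-fold procedure and ridge regularization listed above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the paper's code: the helper names (`ridge_fit`, `kfold_mse`, `select_lambda`), the synthetic data, and the candidate values of the regularization strength are all hypothetical. Setting `k` equal to the number of samples gives leave-one-out cross validation.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: solve (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def kfold_mse(X, y, lam, k=5, seed=0):
    """Average validation MSE of ridge regression over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))          # shuffle once, then split
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the training folds only; the held-out fold is never seen
        # during fitting, which avoids the "cheating" the tutorial warns about.
        w = ridge_fit(X[train], y[train], lam)
        errors.append(np.mean((X[val] @ w - y[val]) ** 2))
    return float(np.mean(errors))

def select_lambda(X, y, lambdas, k=5):
    """Pick the regularization strength with the lowest K-fold MSE."""
    scores = {lam: kfold_mse(X, y, lam, k) for lam in lambdas}
    return min(scores, key=scores.get), scores

# Hypothetical synthetic data: 60 samples, 20 features, 3 of them informative.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + rng.normal(scale=1.0, size=60)

best_lam, scores = select_lambda(X, y, [0.0, 0.1, 1.0, 10.0, 100.0])
print("selected lambda:", best_lam)
for lam, mse in scores.items():
    print(f"lambda={lam:<6} cv_mse={mse:.3f}")
```

Using the same shuffle seed for every candidate keeps the folds identical across values of `lam`, so the scores are compared on the same splits.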

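Results

Research Questions

RQ1: What are the true and generalization errors for training and test data, and how can they be estimated without bias using SURE?
RQ2: How are bias, variance, and MSE related in regression and classification settings, including for ensemble methods?
RQ3: How do cross validation strategies (K-fold, LOOCV) help prevent overfitting and select model complexity?
RQ4: What roles do regularization, bagging, and boosting play in controlling model complexity and generalization?
RQ5: Which practical interpretations and bounds explain why boosting can resist overfitting?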
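
For RQ1, a commonly stated form of Stein's identity gives a sense of what an unbiased estimate of the true error looks like. The setup is an assumption made for illustration: observations $y = \mu + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma^2 I_n)$ and a weakly differentiable estimator $\hat{y} = h(y)$. The paper's own notation may differ.

$$
\mathbb{E}\,\|\hat{y} - \mu\|_2^2 = \mathbb{E}\Big[\,\|\hat{y} - y\|_2^2 - n\sigma^2 + 2\sigma^2 \sum_{i=1}^{n} \frac{\partial \hat{y}_i}{\partial y_i}\Big]
$$

The first term on the right is the training error. The sum of partial derivatives measures how strongly each fitted value follows its own observation, so it grows with model complexity; for a linear smoother $\hat{y} = Hy$ it equals $\mathrm{tr}(H)$.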

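Key Findings

The paper presents definitions of mean squared error, variance, and bias for random variables and models, along with the relationships among them.
SURE provides a framework that links the training error to the true error, which makes it possible to analyze overfitting and the effect of regularization.
K-fold and leave-one-out cross validation are formalized, with guidance on splitting the data and on scenarios where cheating can occur.
Bagging is shown to reduce the variance of an estimator, with random forests, dropout, and computer vision techniques (histogram of oriented gradients, single shot multi-box detector) given as examples.
Boosting is discussed both as an additive model and as a maximum-margin (SVM-like) approach, and an upper bound on its generalization error is given to justify its robustness to overfitting.
Regularization (ridge, lasso, weight decay, noise injection, early stopping) is examined within the same bias-variance framework.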
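
The variance-reduction finding for bagging can be checked numerically with a small simulation. The sketch below is an illustration under assumptions, not the paper's experiment: the base learner (1-nearest-neighbor regression at a single query point), the sine-plus-noise data, and all function names are hypothetical choices. Because the bootstrap models share the same training set, they are correlated, so the reduction is smaller than the 1/k factor that independent models would give.

```python
import numpy as np

def one_nn_predict(x_train, y_train, x0):
    """Predict at x0 with the label of the nearest training point.
    A single 1-NN prediction inherits the full noise of one label."""
    return y_train[np.argmin(np.abs(x_train - x0))]

def bagged_predict(x_train, y_train, x0, n_models, rng):
    """Average the 1-NN prediction over bootstrap resamples."""
    n = len(x_train)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)   # draw n indices with replacement
        preds.append(one_nn_predict(x_train[idx], y_train[idx], x0))
    return float(np.mean(preds))

def compare_variance(n_trials=300, n=50, n_models=25, noise=1.0, x0=0.5, seed=0):
    """Estimate the variance of each predictor at x0 across
    independently drawn training sets."""
    rng = np.random.default_rng(seed)
    single, bagged = [], []
    for _ in range(n_trials):
        x = rng.uniform(0.0, 1.0, size=n)
        y = np.sin(2 * np.pi * x) + rng.normal(scale=noise, size=n)
        single.append(one_nn_predict(x, y, x0))
        bagged.append(bagged_predict(x, y, x0, n_models, rng))
    return float(np.var(single)), float(np.var(bagged))

var_single, var_bagged = compare_variance()
print(f"variance of single 1-NN prediction: {var_single:.3f}")
print(f"variance of bagged 1-NN prediction: {var_bagged:.3f}")
```

In this setup the bagged prediction is effectively a weighted average over the few nearest neighbors of the query point, which is why its variance across training sets should come out lower than that of the single learner.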

This review was generated by AI and checked by a human editor.