QUICK REVIEW

[Paper Review] Robust Training under Label Noise by Over-parameterization

Sheng Liu, Zhihui Zhu | arXiv (Cornell University) | Feb 28, 2022

Machine Learning and Data Classification | Cited by 29

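One-line summary

Introduces Sparse Over-Parameterization (SOP), which separates sparse label noise from clean data in over-parameterized classifiers, and supports it with both theory and experiments showing improved robustness to corrupted labels.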

ABSTRACT

Recently, over-parameterized deep networks, with increasingly more network parameters than training samples, have dominated the performances of modern machine learning. However, when the training data is corrupted, it has been well-known that over-parameterized networks tend to overfit and do not generalize. In this work, we propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted. The main idea is yet very simple: label noise is sparse and incoherent with the network learned from clean data, so we model the noise and learn to separate it from the data. Specifically, we model the label noise via another sparse over-parameterization term, and exploit implicit algorithmic regularizations to recover and separate the underlying corruptions. Remarkably, when trained using such a simple method in practice, we demonstrate state-of-the-art test accuracy against label noise on a variety of real datasets. Furthermore, our experimental results are corroborated by theory on simplified linear models, showing that exact separation between sparse noise and low-rank data can be achieved under incoherent conditions. The work opens many interesting directions for improving over-parameterized models by using sparse over-parameterization and implicit regularization.

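Motivation and objectives

Motivate robust training of over-parameterized deep networks when a fraction of the training labels is corrupted.
Propose a practical algorithm that separates sparse label noise from the data during training.
Provide theoretical insight by showing exact separation under a simplified linear model.
Demonstrate empirical robustness to label noise on both synthetic and real data.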

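Proposed method

Model the unknown label noise with an auxiliary sparse term s_i, parameterized as s_i = u_i ⊙ u_i − v_i ⊙ v_i.
Optimize a joint objective over θ (the network parameters) and the auxiliary variables {u_i, v_i} so that y_i ≈ f(x_i; θ) + s_i.
Run gradient descent with different learning rates, ατ for (u_i, v_i) and τ for θ, to induce implicit regularization.
This implicitly induces an ℓ1 penalty on the sparse noise s_i, which connects the method to robust sparse modeling.
Provide implementation variants that use cross-entropy and MSE losses, with suitable projections to enforce constraints on u_i and v_i.
A theoretical analysis of a simplified over-parameterized linear model shows that noise and data can be separated exactly under incoherence and low-rank conditions.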
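The parameterization and the two learning rates described above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it replaces the deep network with a plain linear model f(x; θ) = xᵀθ, uses a squared loss, and attaches one scalar pair (u_i, v_i) to each sample. The function names, the toy data generator, and the values of `tau`, `alpha`, `init`, and `steps` are all assumptions chosen for this illustration.

```python
import numpy as np


def sop_linear_regression(X, y, tau=0.005, alpha=2.0, init=1e-3, steps=2000):
    """Toy SOP-style fit: y ~ X @ theta + (u*u - v*v), trained by gradient descent.

    tau is the step size for theta and alpha * tau the step size for (u, v).
    All hyperparameter values are illustrative assumptions.
    """
    n, d = X.shape
    theta = np.zeros(d)
    u = np.full(n, init)  # small initialization keeps s = u*u - v*v near zero at the start
    v = np.full(n, init)
    for _ in range(steps):
        r = X @ theta + u * u - v * v - y  # residual of the loss 0.5 * ||r||^2
        grad_theta = X.T @ r
        grad_u = 2.0 * u * r               # d/du of 0.5 * r^2 with s = u^2 - v^2
        grad_v = -2.0 * v * r
        theta = theta - tau * grad_theta
        u = u - alpha * tau * grad_u
        v = v - alpha * tau * grad_v
    return theta, u * u - v * v


def make_toy_data(n=100, d=5, n_corrupt=10, magnitude=5.0, seed=0):
    """Hypothetical data: clean linear targets plus a few large, sparse corruptions."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    s_true = np.zeros(n)
    idx = rng.choice(n, size=n_corrupt, replace=False)
    s_true[idx] = magnitude * rng.choice([-1.0, 1.0], size=n_corrupt)
    y = X @ w_true + s_true
    return X, y, w_true, s_true


if __name__ == "__main__":
    X, y, w_true, s_true = make_toy_data()
    theta, s_hat = sop_linear_regression(X, y)
    w_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # baseline that ignores the corruption
    print("OLS parameter error:", np.linalg.norm(w_ols - w_true))
    print("SOP parameter error:", np.linalg.norm(theta - w_true))
```

In this sketch the ordinary least-squares baseline absorbs the corrupted targets into its parameter estimate, while the extra term u*u - v*v is free to grow only on the few samples whose residual stays large. Real label-noise settings use one-hot labels, a deep network, and the loss variants mentioned above, none of which are reproduced here.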

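Experimental results

Research questions

RQ1: Can over-parameterized models be trained robustly even when some labels are corrupted?
RQ2: Does an auxiliary sparse over-parameterization term make it possible to separate label noise from clean data during training?
RQ3: What implicit regularization effect arises from the proposed gradient dynamics on the auxiliary variables?
RQ4: Do the theoretical results for the simplified linear model explain the empirical robustness observed with SOP?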

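Key findings

SOP prevents overfitting to incorrect training labels and improves test accuracy under label noise on multiple datasets.
SOP+ further improves performance by incorporating consistency and class-balance regularization.
Experiments show that SOP and SOP+ outperform several baselines on CIFAR-10/100 with synthetic and realistic label noise, and on Clothing-1M and WebVision.
The theoretical analysis of the simplified linear model shows that, under sparse corruption and under incoherence and low-rank assumptions on the data, the gradient dynamics recover the ground-truth parameters and act as ℓ1 regularization on the noise.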
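A short calculation gives some intuition for why the u ⊙ u − v ⊙ v parameterization favors sparse noise estimates. This is a continuous-time idealization written for this review, not a statement of the paper's theorem; the initialization scale γ, the loss gradient g_i = ∂L/∂s_i, and its time integral G_i are symbols introduced here as assumptions.

```latex
% Gradient flow on the auxiliary variables, with s_i = u_i \odot u_i - v_i \odot v_i
% and g_i(t) = \partial L / \partial s_i evaluated along the trajectory:
\begin{align*}
  \dot{u}_i &= -2\alpha\, g_i \odot u_i, &
  \dot{v}_i &= +2\alpha\, g_i \odot v_i .
\end{align*}
% Both equations are linear in u_i and v_i, so they integrate in closed form.
% With G_i(t) = \int_0^t g_i(\tau)\, d\tau and u_i(0) = v_i(0) = \gamma \mathbf{1}:
\begin{align*}
  u_i(t) &= \gamma \exp\!\bigl(-2\alpha\, G_i(t)\bigr), &
  v_i(t) &= \gamma \exp\!\bigl(+2\alpha\, G_i(t)\bigr), \\
  s_i(t) &= \gamma^2 \Bigl( e^{-4\alpha G_i(t)} - e^{4\alpha G_i(t)} \Bigr)
          = -2\gamma^2 \sinh\!\bigl(4\alpha\, G_i(t)\bigr).
\end{align*}
```

Under these assumptions, an entry of s_i stays at the scale γ² unless its accumulated gradient becomes large, so with a small initialization only samples whose residual persists acquire a sizeable noise estimate. This multiplicative growth is the informal mechanism behind the ℓ1-type bias described above.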

This review was generated by AI and checked by a human editor.