QUICK REVIEW

[Paper Review] Open-set Label Noise Can Improve Robustness Against Inherent Label Noise

Hongxin Wei, Lue Tao|arXiv (Cornell University)|Jun 21, 2021

Machine Learning and Data Classification | References: 77 | Citations: 30

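One-line summary

This paper shows that open-set label noise need not harm, and can even improve, robustness against inherent label noise. It proposes a regularizer, Open-set samples with Dynamic Noisy Labels (ODNL), that uses up the network's excess capacity to improve generalization and OOD detection.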

ABSTRACT

Learning with noisy labels is a practically challenging problem in weakly supervised learning. In the existing literature, open-set noises are always considered to be poisonous for generalization, similar to closed-set noises. In this paper, we empirically show that open-set noisy labels can be non-toxic and even benefit the robustness against inherent noisy labels. Inspired by the observations, we propose a simple yet effective regularization by introducing Open-set samples with Dynamic Noisy Labels (ODNL) into training. With ODNL, the extra capacity of the neural network can be largely consumed in a way that does not interfere with learning patterns from clean data. Through the lens of SGD noise, we show that the noises induced by our method are random-direction, conflict-free and biased, which may help the model converge to a flat minimum with superior stability and enforce the model to produce conservative predictions on Out-of-Distribution instances. Extensive experimental results on benchmark datasets with various types of noisy labels demonstrate that the proposed method not only enhances the performance of many existing robust algorithms but also achieves significant improvement on Out-of-Distribution detection tasks even in the label noise setting.

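Motivation and objectives

Investigate whether open-set noisy labels can be harmless, or even beneficial, for generalization under label noise.
Analyze how open-set auxiliary data affects training dynamics and robustness to inherent label noise.
Propose a practical regularization technique (ODNL) that uses dynamic open-set labels to consume the neural network's capacity without interfering with learning from clean data.
Relate the method theoretically to SGD noise and contrast it with related approaches (OE, SLN, label randomization).
Demonstrate effectiveness on CIFAR-10/100 and Clothing1M, including improved OOD detection under label noise.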

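Proposed method

Introduce the training objective L_total = L1 + eta * L2, where L1 is the standard cross-entropy on the training data and L2 fits the open-set data to dynamically, randomly assigned labels.
Apply L2 by pairing each open-set auxiliary sample with a dynamic noisy label drawn uniformly from the label set during training.
Show that ODNL induces SGD noise that is random-direction, conflict-free, and biased, which may encourage convergence to flat minima and promote conservative OOD predictions.
Show that ODNL can be combined with existing robust training methods (sample selection, loss correction, robust losses) to improve their performance.
Compare ODNL with related techniques (OE, SLN, DisturbLabel) and discuss how they differ in dynamism and in their effect on robustness to inherent label noise.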
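
The objective L_total = L1 + eta * L2 described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes L2 is a cross-entropy against the freshly sampled label, and the function names (`softmax`, `cross_entropy`, `odnl_style_loss`), the batch layout, and the value of `eta` in the usage lines are all hypothetical choices made for this example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample against an integer class label."""
    return -math.log(softmax(logits)[label])


def odnl_style_loss(train_logits, train_labels, open_logits, num_classes, eta, rng):
    """Return (L1 + eta * L2, sampled open-set labels) for one batch.

    L1: mean cross-entropy on the (possibly noisy) training batch.
    L2: mean cross-entropy on the open-set batch, using labels drawn
        uniformly at random on every call (the "dynamic" part).
    Assumption: L2 is a cross-entropy; eta is an illustrative weight.
    """
    l1 = sum(cross_entropy(z, y) for z, y in zip(train_logits, train_labels))
    l1 /= len(train_labels)

    # Fresh uniform labels each call, so an open-set sample never keeps
    # a fixed target across training steps.
    open_labels = [rng.randrange(num_classes) for _ in open_logits]
    l2 = sum(cross_entropy(z, y) for z, y in zip(open_logits, open_labels))
    l2 /= len(open_labels)

    return l1 + eta * l2, open_labels


# Usage with made-up logits for a 3-class problem.
rng = random.Random(0)
loss, sampled = odnl_style_loss(
    train_logits=[[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]],
    train_labels=[0, 2],
    open_logits=[[0.3, 0.1, 0.0]],
    num_classes=3,
    eta=0.5,  # hypothetical value, not taken from the paper
    rng=rng,
)
print(loss, sampled)
```

Because the labels are redrawn on every call, the open-set term has no stable target to memorize. That is the property the review contrasts with fixed-label schemes.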

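Experimental results

Research questions

RQ1: Can open-set noisy labels be harmless, or even beneficial, for generalization under inherent label noise?
RQ2: How does introducing open-set auxiliary data with dynamic noisy labels affect training dynamics and noise robustness?
RQ3: Can ODNL improve OOD detection, and can it be combined with existing robust training methods?
RQ4: How does the SGD noise induced by ODNL relate to other label-noise regularization methods?
RQ5: Is ODNL effective on standard benchmarks, including ones with real-world noisy labels (CIFAR-10/100, Clothing1M)?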

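Key findings

When the auxiliary dataset is sufficiently large, open-set noisy labels can improve robustness to inherent label noise.
ODNL consistently improves robustness on CIFAR-10/100 and Clothing1M and strengthens OOD detection under label noise.
ODNL boosts the performance of a range of existing robust training methods (e.g., Decoupling, F-correction, PHuber-CE, Co-teaching, JoCoR).
The SGD-noise analysis indicates that ODNL introduces random-direction, conflict-free, and biased noise, which may help the model reach flat minima and produce conservative OOD predictions.
Compared with Outlier Exposure (OE), ODNL supplies dynamic noise and yields larger gains in robustness to noisy labels.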
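
One way to see why uniformly random labels would push the model toward conservative predictions on open-set inputs is a short worked calculation. It is a sketch under stated assumptions, not the paper's SGD-noise argument: it assumes the open-set term is a cross-entropy against a label drawn uniformly from K classes, and the symbols K, \tilde{x}, \tilde{y}, p_k, and u are notation introduced here for illustration.

```latex
\mathbb{E}_{\tilde{y}\sim\mathrm{Unif}\{1,\dots,K\}}
  \bigl[-\log p_{\tilde{y}}(\tilde{x})\bigr]
  = -\frac{1}{K}\sum_{k=1}^{K}\log p_k(\tilde{x})
  = \log K + \mathrm{KL}\bigl(u \,\|\, p(\tilde{x})\bigr),
  \qquad u_k = \tfrac{1}{K}.
```

Since the KL term is non-negative and vanishes only when p(\tilde{x}) = u, the expected loss on an open-set input is minimized by the uniform prediction, that is, by maximal uncertainty. Under these assumptions, this matches the conservative OOD behavior reported in the findings.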

This review was generated by AI and checked by a human editor.