QUICK REVIEW

[Paper Review] Confidence Scores Make Instance-dependent Label-noise Learning Possible

Antonin Berthon, Bo Han|arXiv (Cornell University)|Jan 11, 2020

Machine Learning and Data Classification | References: 58 | Citations: 36

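One-Sentence Summary

This paper introduces confidence-scored instance-dependent noise (CSIDN) and instance-level forward correction (ILFC), a method that uses per-instance confidence scores to estimate class transition probabilities; together they enable robust learning under instance-dependent label noise and outperform baselines on synthetic and real data.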

ABSTRACT

In learning with noisy labels, for every instance, its label can randomly walk to other classes following a transition distribution which is named a noise model. Well-studied noise models are all instance-independent, namely, the transition depends only on the original label but not the instance itself, and thus they are less practical in the wild. Fortunately, methods based on instance-dependent noise have been studied, but most of them have to rely on strong assumptions on the noise models. To alleviate this issue, we introduce confidence-scored instance-dependent noise (CSIDN), where each instance-label pair is equipped with a confidence score. We find with the help of confidence scores, the transition distribution of each instance can be approximately estimated. Similarly to the powerful forward correction for instance-independent noise, we propose a novel instance-level forward correction for CSIDN. We demonstrate the utility and effectiveness of our method through multiple experiments under synthetic label noise and real-world unknown noise.

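Motivation and Objectives

Motivate learning with noisy labels in a realistic, instance-dependent setting, where label corruption depends on both the class and the features.
Introduce a per-instance confidence score to approximate the noise transition, making instance-dependent noise (IDN) tractable.
Propose the CSIDN model and a practical algorithm (ILFC) that estimates instance-specific transitions and trains a robust classifier.
Demonstrate effectiveness through synthetic experiments and real-world datasets, including Clothing1M.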

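Proposed Method

Define a confidence score for each instance, r_x = P(Y = \bar{y} | \bar{Y} = \bar{y}, X = x), i.e., the probability that the observed noisy label \bar{y} is the true label.
Formulate CSIDN with a noise transition matrix T(x) that is instance-dependent, but whose diagonal entries T_{i,i}(x) can be estimated from r_x and a density ratio β_i(x).
Assume the off-diagonal entries satisfy T_{i,j}(x) = α_{i,j} (1 - T_{i,i}(x)), where α_{i,j} is constant across instances and is estimated using anchor points.
Develop an iterative procedure that estimates β_i(x) and T_{i,i}(x) from the noisy data and the classifier outputs.
Define an instance-level forward-corrected loss and train the classifier h in the ILFC steps, which include anchor-point-based estimation of α and T and updates of the density ratio.
Provide a practical training algorithm (Algorithm 1) that alternates between estimating T, correcting the loss, and updating β, using a naive classifier trained on the noisy labels together with the current model.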
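
The structure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the convention T_{i,j}(x) = P(noisy label = j | clean label = i, x), assumes that α has a zero diagonal and rows summing to 1, and assumes that β_i(x) is the ratio of the noisy to the clean class posterior, so that the diagonal entry for the observed class follows from Bayes' rule as r_x · β_i(x). The function names and the toy numbers are hypothetical.

```python
import numpy as np


def observed_class_diagonal(r_x, beta):
    """Diagonal entry for the observed class, assuming T_ii(x) = r_x * beta_i(x).

    Assumption: beta_i(x) = P(noisy = i | x) / P(clean = i | x), so the product
    is Bayes' rule. The result is clipped so it stays a valid probability.
    """
    return float(np.clip(r_x * beta, 0.0, 1.0))


def build_transition(diag, alpha):
    """Build T(x) for one instance from its diagonal and the shared alpha.

    diag:  shape (K,), entries T_ii(x).
    alpha: shape (K, K), zero diagonal, each row summing to 1 (assumption).
    Off-diagonal entries follow T_ij(x) = alpha_ij * (1 - T_ii(x)).
    """
    diag = np.asarray(diag, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    T = alpha * (1.0 - diag)[:, None]  # scale row i by (1 - T_ii)
    np.fill_diagonal(T, diag)          # put T_ii back on the diagonal
    return T


def forward_corrected_nll(clean_probs, T, noisy_label, eps=1e-12):
    """Negative log-likelihood of the noisy label after forward correction.

    The predicted clean posterior is pushed through T(x) to obtain the
    predicted noisy posterior, which is then scored against the noisy label.
    """
    noisy_probs = np.asarray(clean_probs, dtype=float) @ T
    return float(-np.log(noisy_probs[noisy_label] + eps))


# Toy example with K = 3 classes (all numbers are made up).
alpha = np.array([[0.0, 0.5, 0.5],
                  [0.25, 0.0, 0.75],
                  [1.0, 0.0, 0.0]])
diag = np.array([0.8, 0.6, 0.9])
T_x = build_transition(diag, alpha)

clean_probs = np.array([0.7, 0.2, 0.1])  # classifier output h(x)
loss = forward_corrected_nll(clean_probs, T_x, noisy_label=0)
```

Because every row of α sums to 1, each row of T(x) also sums to 1, so the corrected output remains a probability distribution over the noisy labels. In a training loop, a loss of this form would be averaged over a mini-batch, with T(x) re-estimated as β is updated.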

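Experimental Results

Research Questions

RQ1: Does attaching a confidence score to each instance make the estimation of instance-dependent noise transitions practically feasible?
RQ2: How can the per-instance diagonal transition probabilities and the between-class flip probabilities be estimated using confidence scores and anchor points?
RQ3: Does ILFC improve robustness to instance-dependent label noise across synthetic and real-world datasets?
RQ4: Do confidence scores remain robust and useful (i.e., robust to perturbation) when confidence annotations are initially only partially available or entirely unavailable, as in Clothing1M?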

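Key Findings

ILFC achieves robust performance even under high levels of instance-dependent noise, where the baselines fail.
On synthetic data, ILFC outperforms forward correction, MAE, Lq-norm, and Co-teaching, with the largest advantage at high noise rates.
On the real datasets (SVHN, CIFAR-10), ILFC converges faster and maintains high accuracy even when the IDN is large.
On Clothing1M, ILFC outperforms Forward, MAE, Lq-norm, and Co-teaching, achieving the highest reported accuracy.
ILFC remains effective when the confidence scores are perturbed, indicating robustness to imperfect confidence estimates.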


This review was generated by AI and checked by a human editor.