QUICK REVIEW

[Paper Review] Meta Transition Adaptation for Robust Deep Learning with Noisy Labels

Jun Shu, Qian Zhao | arXiv (Cornell University) | Jun 10, 2020
Machine Learning and Data Classification | 49 references | 22 citations
TL;DR

This paper proposes a meta-transition adaptation method that uses a small set of clean-label meta-data to jointly optimize the noise transition matrix and the classifier, avoiding anchor-point assumptions and improving robustness to noisy labels. The method comes with a statistical consistency guarantee for the estimated transition matrix and outperforms SOTA methods on both synthetic and real-world benchmarks, including in no-noise scenarios.

ABSTRACT

To discover intrinsic inter-class transition probabilities underlying data, learning with noise transition has become an important approach for robust deep learning on corrupted labels. Prior methods attempt to achieve such transition knowledge by pre-assuming strongly confident anchor points with 1-probability belonging to a specific class, generally infeasible in practice, or directly jointly estimating the transition matrix and learning the classifier from the noisy samples, always leading to inaccurate estimation misguided by wrong annotation information especially in large noise cases. To alleviate these issues, this study proposes a new meta-transition-learning strategy for the task. Specifically, through the sound guidance of a small set of meta data with clean labels, the noise transition matrix and the classifier parameters can be mutually ameliorated to avoid being trapped by noisy training samples, and without need of any anchor point assumptions. Besides, we prove our method is with statistical consistency guarantee on correctly estimating the desired transition matrix. Extensive synthetic and real experiments validate that our method can more accurately extract the transition matrix, naturally following its more robust performance than prior arts. Its essential relationship with label distribution learning is also discussed, which explains its fine performance even under no-noise scenarios.

Motivation & Objective

  • To address the limitations of prior methods that rely on strong anchor-point assumptions or suffer from inaccurate transition matrix estimation in high-noise regimes.
  • To develop a meta-learning framework that jointly optimizes the classifier and transition matrix using a small set of clean-label meta-data.
  • To provide a statistical consistency guarantee for accurate transition matrix estimation under noisy label conditions.
  • To improve model generalization and robustness, even in the absence of label noise, by capturing inter-class ambiguity inherent in real-world data.

Proposed method

  • The method uses a small set of clean-label meta-data to guide the joint optimization of the noise transition matrix and the classifier parameters via meta-learning.
  • It formulates a bi-level optimization problem where the outer loop minimizes cross-entropy loss on meta-data, and the inner loop optimizes the classifier and transition matrix on noisy training data.
  • The transition matrix is estimated as a class-conditional probability distribution over label flips, learned through gradient updates guided by meta-data.
  • The approach avoids explicit anchor-point assumptions by relying on the meta-data to stabilize estimation and prevent overfitting to noisy labels.
  • Theoretical analysis proves statistical consistency of the estimated transition matrix under mild regularity conditions.
  • The method naturally extends to label distribution learning by approximating soft label distributions from hard labels via the learned transition matrix.
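To make the role of the transition matrix concrete, the following is a minimal NumPy sketch of the standard forward-correction idea that transition-based methods build on. It is not the authors' implementation. It assumes the usual class-conditional convention T[i, j] = P(observed label = j | clean label = i), and the function names (`transition_from_logits`, `forward_corrected_ce`, `soft_labels`) are hypothetical. The meta-learning update of the matrix from clean meta-data is deliberately left out; only the loss that a given matrix induces on noisy labels, and one possible way to read soft labels off its rows, are shown.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def transition_from_logits(t_logits):
    """Map unconstrained parameters to a row-stochastic matrix.

    Assumed parameterization: a row-wise softmax, so each row T[i] is a
    distribution over observed labels given clean class i.
    """
    return softmax(t_logits, axis=1)


def forward_corrected_ce(clean_probs, noisy_labels, T):
    """Cross-entropy between noisy labels and T-corrected predictions.

    P(noisy = j | x) = sum_i P(clean = i | x) * T[i, j]
    """
    noisy_probs = clean_probs @ T
    picked = noisy_probs[np.arange(len(noisy_labels)), noisy_labels]
    return float(-np.log(picked + 1e-12).mean())


def soft_labels(hard_labels, T):
    """Assumed reading of the label-distribution view: row T[y] as a soft label."""
    return T[hard_labels]


# Hypothetical 3-class example: classes 0 and 1 are sometimes confused.
T = np.array([[0.8, 0.2, 0.0],
              [0.3, 0.7, 0.0],
              [0.0, 0.0, 1.0]])
clean_probs = softmax(np.array([[2.0, 0.5, -1.0],
                                [0.1, 1.5, 0.0]]))
noisy_labels = np.array([1, 0])

print(forward_corrected_ce(clean_probs, noisy_labels, T))
print(soft_labels(np.array([0, 2]), T))
```

With the identity matrix as T, the corrected loss reduces to ordinary cross-entropy, which gives a quick sanity check on the convention used here.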

Experimental results

Research questions

  • RQ1: Can a meta-learning strategy improve the accuracy of noise transition matrix estimation without requiring anchor-point assumptions?
  • RQ2: How does the proposed method perform compared to SOTA methods in high-noise and low-noise label regimes?
  • RQ3: Can the method enhance model generalization and robustness under distribution shift and adversarial attacks?
  • RQ4: What is the theoretical relationship between the proposed method and label distribution learning?

Key findings

  • The proposed method achieves more accurate transition matrix estimation than prior SOTA methods on both synthetic and real-world noisy datasets, including Clothing1M.
  • On Clothing1M, the method significantly reduces top-1 error compared to cross-entropy training and other SOTA baselines, demonstrating improved robustness.
  • The method achieves test accuracy within 1% of soft-label training (CIFAR10H) on out-of-distribution datasets, showing strong generalization.
  • Under FGSM and PGD adversarial attacks, the method maintains higher accuracy and lower cross-entropy loss than hard-label training, indicating improved robustness.
  • The method generalizes well even in no-noise scenarios, outperforming standard cross-entropy training by capturing inter-class ambiguity through learned transition matrices.
  • Theoretical analysis confirms that the method provides a statistically consistent estimate of the true transition matrix under mild conditions.
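For readers unfamiliar with the FGSM attack mentioned in the findings above, here is a minimal NumPy sketch of the general technique: perturb each input by a fixed step in the direction of the sign of the loss gradient with respect to the input. The linear softmax classifier, the names `W`, `b`, `eps`, and the random data are all illustrative assumptions; this is not the evaluation setup used in the paper, which is not described in this review.

```python
import numpy as np


def log_softmax(z):
    """Numerically stable row-wise log-softmax."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def ce_loss(W, b, x, y):
    """Mean cross-entropy of a linear softmax classifier with logits x @ W.T + b."""
    logp = log_softmax(x @ W.T + b)
    return float(-logp[np.arange(len(y)), y].mean())


def fgsm(W, b, x, y, eps):
    """One FGSM step: x_adv = x + eps * sign(d loss / d x).

    For linear logits, the per-sample gradient of the cross-entropy with
    respect to the input is (p - onehot(y)) @ W.
    """
    p = np.exp(log_softmax(x @ W.T + b))   # shape (n, k)
    onehot = np.eye(W.shape[0])[y]         # shape (n, k)
    grad_x = (p - onehot) @ W              # shape (n, d)
    return x + eps * np.sign(grad_x)


# Hypothetical toy problem: 3 classes, 5 input features, 4 samples.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
b = np.zeros(3)
x = rng.normal(size=(4, 5))
y = np.array([0, 1, 2, 1])

x_adv = fgsm(W, b, x, y, eps=0.1)
print(ce_loss(W, b, x, y), ce_loss(W, b, x_adv, y))
```

Because the cross-entropy of a linear classifier is convex in the input, this step cannot decrease the loss in the toy setting; for deep networks the same step is only a first-order approximation of the worst-case perturbation.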

This review was created by AI and reviewed by human editors.