QUICK REVIEW

[Paper Review] Understanding and Improving Early Stopping for Learning with Noisy Labels

Yingbin Bai, Erkun Yang|arXiv (Cornell University)|Jun 30, 2021

Machine Learning and Data Classification | References: 34 | Cited by: 53

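One-Sentence Summary

Proposes Progressive Early Stopping (PES), which splits a DNN into several parts for training: earlier layers receive more training epochs and later layers fewer, to better exploit the memorization effect and improve learning with noisy labels.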

ABSTRACT

The memorization effect of deep neural network (DNN) plays a pivotal role in many state-of-the-art label-noise learning methods. To exploit this property, the early stopping trick, which stops the optimization at the early stage of training, is usually adopted. Current methods generally decide the early stopping point by considering a DNN as a whole. However, a DNN can be considered as a composition of a series of layers, and we find that the latter layers in a DNN are much more sensitive to label noise, while their former counterparts are quite robust. Therefore, selecting a stopping point for the whole network may make different DNN layers antagonistically affected each other, thus degrading the final performance. In this paper, we propose to separate a DNN into different parts and progressively train them to address this problem. Instead of the early stopping, which trains a whole DNN all at once, we initially train former DNN layers by optimizing the DNN with a relatively large number of epochs. During training, we progressively train the latter DNN layers by using a smaller number of epochs with the preceding layers fixed to counteract the impact of noisy labels. We term the proposed method as progressive early stopping (PES). Despite its simplicity, compared with the early stopping, PES can help to obtain more promising and stable results. Furthermore, by combining PES with existing approaches on noisy label training, we achieve state-of-the-art performance on image classification benchmarks.

Motivation and Objectives

Motivate and analyze how label noise affects different layers in a DNN during training.
Propose PES to progressively train network parts with layer-dependent early stopping epochs.
Show that PES better leverages memorization and reduces sensitivity to noisy labels.
Demonstrate that combining PES with existing noisy-label techniques yields state-of-the-art results on benchmarks.

Proposed Method

Split a DNN into L parts and train the first part for T1 epochs while optimizing the full network.
For each subsequent part l, freeze the earlier parts and train the l-th part for Tl epochs, with Tl decreasing as l increases.
Justify that latter layers are more sensitive to noise and thus benefit from shorter training per stage.
Define confident examples from PES-trained models via augmented-consensus predictions and use class-weighted losses.
Combine PES with semi-supervised learning (MixMatch) to utilize unlabeled/noisy data effectively.
Provide algorithmic steps (Algorithm 1) for PES with optional semi-supervised refinement.
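
The staged training described above can be pictured as a simple schedule. The following is a minimal sketch, not the paper's Algorithm 1: the `Stage` class and `pes_schedule` function are hypothetical names introduced here, and the value T1 = 25 is a placeholder chosen only for illustration (the summary gives no value for T1). How parts after the l-th are handled during stage l is not specified in this summary, so the sketch records only which parts are frozen and which part is targeted.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    index: int               # 1-based stage number l
    epochs: int              # T_l, the epoch budget for this stage
    frozen_parts: List[int]  # parts held fixed during this stage
    trained_part: int        # the part this stage targets


def pes_schedule(epochs_per_part: List[int]) -> List[Stage]:
    """Build a PES-style stage list from [T_1, ..., T_L] (illustrative only)."""
    if not epochs_per_part:
        raise ValueError("need at least one part")
    if any(t <= 0 for t in epochs_per_part):
        raise ValueError("epoch counts must be positive")
    # The summary requires T_l to decrease with l.
    for earlier, later in zip(epochs_per_part, epochs_per_part[1:]):
        if later >= earlier:
            raise ValueError("T_l must decrease with l")
    stages = []
    for l, t in enumerate(epochs_per_part, start=1):
        # Stage 1 freezes nothing (the full network is optimized);
        # stage l > 1 freezes parts 1..l-1.
        stages.append(
            Stage(index=l, epochs=t, frozen_parts=list(range(1, l)), trained_part=l)
        )
    return stages


# T_1 = 25 is a placeholder assumption; 7 and 5 echo the sensitivity analysis.
for stage in pes_schedule([25, 7, 5]):
    print(stage)
```

In a real training loop, each stage would translate into disabling gradient updates for the frozen parts before running the stage's epochs; that step depends on the framework and is left out here.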

Experimental Results

Research Questions

RQ1: How does label noise differently affect internal DNN layers during training, and can this be exploited to improve learning with noisy labels?
RQ2: Does progressively stopping/training parts of a network outperform traditional whole-network early stopping under various noise types and levels?
RQ3: Can PES be effectively combined with confident-example selection and semi-supervised learning to achieve state-of-the-art results?
RQ4: What are the empirical gains of PES on synthetic (CIFAR-10/100) and real-world (Clothing-1M) noisy-label benchmarks?

Key Findings

PES consistently yields higher test accuracy and lower variance than traditional early stopping across symmetric, pairflip, and instance-dependent noise on CIFAR-10/100.
PES improves label precision and recall for confident examples, enhancing the quality of selected labels.
PES combined with semi-supervised learning outperforms baselines and achieves state-of-the-art results on CIFAR-10/100 with synthetic noise and competitive results on Clothing-1M.
Sensitivity analysis shows best performance when the second and third parts have Tl around 7 and 5 epochs respectively, with robustness across noise types.
PES incurs a training time comparable to that of standard early stopping while delivering better performance.
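
To make "label precision and recall for confident examples" concrete, here is a small sketch under stated assumptions. It assumes that an example counts as confident when the class obtained by averaging predictions over its augmented views matches its given (possibly noisy) label, that precision is the fraction of selected examples whose given label is correct, and that recall is the fraction of correctly labeled examples that get selected. These are common definitions rather than ones spelled out in this summary, and all function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def consensus_prediction(view_probs: Sequence[Sequence[float]]) -> int:
    """Average class probabilities over augmented views and return the argmax."""
    num_classes = len(view_probs[0])
    mean = [
        sum(view[c] for view in view_probs) / len(view_probs)
        for c in range(num_classes)
    ]
    return max(range(num_classes), key=mean.__getitem__)


def select_confident(probs_per_example, given_labels: Sequence[int]) -> List[int]:
    """Keep indices whose consensus prediction agrees with the given label."""
    return [
        i
        for i, (views, label) in enumerate(zip(probs_per_example, given_labels))
        if consensus_prediction(views) == label
    ]


def label_precision_recall(
    selected: Sequence[int],
    given_labels: Sequence[int],
    true_labels: Sequence[int],
) -> Tuple[float, float]:
    """Precision and recall of the selection with respect to clean labels."""
    clean = {i for i, (g, t) in enumerate(zip(given_labels, true_labels)) if g == t}
    chosen = set(selected)
    hits = len(chosen & clean)
    precision = hits / len(chosen) if chosen else 0.0
    recall = hits / len(clean) if clean else 0.0
    return precision, recall


# Toy data: 4 examples, 2 augmented views each, 2 classes (all values made up).
probs = [
    [[0.9, 0.1], [0.8, 0.2]],
    [[0.6, 0.4], [0.7, 0.3]],
    [[0.55, 0.45], [0.65, 0.35]],
    [[0.2, 0.8], [0.6, 0.4]],
]
given = [0, 0, 1, 1]  # labels as observed, possibly noisy
true = [0, 1, 1, 1]   # ground truth, available only for evaluation

selected = select_confident(probs, given)
precision, recall = label_precision_recall(selected, given, true)
print(selected, precision, recall)
```

In this toy case the second example is selected even though its given label is wrong, and the third is missed even though its label is clean, so both metrics fall below 1. Ground-truth labels are needed only to evaluate the selection, not to perform it.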

This review was generated by AI and reviewed by human editors.