QUICK REVIEW

[Paper Review] Understanding and Improving Early Stopping for Learning with Noisy Labels

Yingbin Bai, Erkun Yang | arXiv (Cornell University) | 2021-06-30

Machine Learning and Data Classification | References: 34 | Citations: 53

One-Line Summary

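Proposes progressive early stopping (PES): the DNN is trained in parts, with earlier layers given more epochs and later layers fewer, so that the memorization effect is exploited better and learning with noisy labels improves.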

ABSTRACT

The memorization effect of deep neural network (DNN) plays a pivotal role in many state-of-the-art label-noise learning methods. To exploit this property, the early stopping trick, which stops the optimization at the early stage of training, is usually adopted. Current methods generally decide the early stopping point by considering a DNN as a whole. However, a DNN can be considered as a composition of a series of layers, and we find that the latter layers in a DNN are much more sensitive to label noise, while their former counterparts are quite robust. Therefore, selecting a stopping point for the whole network may make different DNN layers antagonistically affected each other, thus degrading the final performance. In this paper, we propose to separate a DNN into different parts and progressively train them to address this problem. Instead of the early stopping, which trains a whole DNN all at once, we initially train former DNN layers by optimizing the DNN with a relatively large number of epochs. During training, we progressively train the latter DNN layers by using a smaller number of epochs with the preceding layers fixed to counteract the impact of noisy labels. We term the proposed method as progressive early stopping (PES). Despite its simplicity, compared with the early stopping, PES can help to obtain more promising and stable results. Furthermore, by combining PES with existing approaches on noisy label training, we achieve state-of-the-art performance on image classification benchmarks.

Motivation and Objectives

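Motivate and analyze how label noise affects different DNN layers over the course of training.
Propose PES, which trains parts of the network progressively with a different early-stopping epoch for each part.
Show that PES exploits the memorization effect better and reduces sensitivity to noisy labels.
Show that combining PES with existing noisy-label methods achieves state-of-the-art performance on benchmarks.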

Proposed Method

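Split the DNN into L parts and train the first part for T1 epochs by optimizing the whole network.
Progressively freeze the earlier parts: train the l-th part for Tl epochs with all preceding parts fixed, where Tl decreases as l increases.
Justify the shorter per-stage training time (Tl) by the observation that later layers are more sensitive to label noise.
Define confident examples from the PES-trained model through augmented-consensus predictions, and train with a class-weighted loss.
Combine PES with semi-supervised learning (MixMatch) to make effective use of unlabeled or noisy data.
Provide the algorithmic steps of PES (Algorithm 1) together with an optional semi-supervised refinement.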
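The staged freezing described above can be sketched as a plain schedule builder. This is a minimal illustration, not the paper's Algorithm 1: the names `Stage` and `pes_schedule` are hypothetical, the choice to keep all parts from l onward trainable in stage l is an assumption drawn from the summary, and T1 = 25 is an invented value (7 and 5 are the values the sensitivity analysis in this review reports for the second and third parts).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    epochs: int           # T_l: epochs spent in this stage
    frozen: List[int]     # indices of parts whose weights are held fixed
    trainable: List[int]  # indices of parts still being optimized


def pes_schedule(epochs_per_part: List[int]) -> List[Stage]:
    """Build a PES-style stage plan for a network split into L parts.

    Stage 1 optimizes every part for T_1 epochs. Stage l (l >= 2) freezes
    parts 1..l-1 and optimizes the remaining parts for T_l epochs.
    (Assumption: parts after l stay trainable during stage l.)
    """
    if not epochs_per_part:
        raise ValueError("need at least one part")
    if any(t <= 0 for t in epochs_per_part):
        raise ValueError("epoch counts must be positive")
    # The summary states that T_l decreases with l; reject increasing plans.
    if any(a < b for a, b in zip(epochs_per_part, epochs_per_part[1:])):
        raise ValueError("T_l should not increase with l")

    num_parts = len(epochs_per_part)
    return [
        Stage(
            epochs=t,
            frozen=list(range(stage)),
            trainable=list(range(stage, num_parts)),
        )
        for stage, t in enumerate(epochs_per_part)
    ]


# T_1 = 25 is a hypothetical value chosen only for illustration.
plan = pes_schedule([25, 7, 5])
for number, stage in enumerate(plan, start=1):
    print(f"stage {number}: {stage.epochs} epochs, "
          f"frozen={stage.frozen}, trainable={stage.trainable}")
```

In a framework such as PyTorch, freezing a part would typically mean setting `requires_grad = False` on its parameters before the next stage starts; the sketch leaves the training loop itself out.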

Experimental Results

Research Questions

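RQ1: Does label noise affect the internal layers of a DNN differently during training, and can this be exploited to improve learning with noisy labels?
RQ2: Does progressively stopping and training parts of the network outperform conventional whole-network early stopping across noise types and noise levels?
RQ3: Can PES be combined effectively with confident-example selection and semi-supervised learning to reach state-of-the-art results?
RQ4: What empirical gains does PES deliver on synthetic noise (CIFAR-10/100) and on a real-world noisy benchmark (Clothing-1M)?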

Key Results

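On CIFAR-10/100 with symmetric, pair-flip, and instance-dependent noise, PES consistently achieves higher test accuracy and lower variance than conventional early stopping.
PES improves the label precision and recall of confident examples, raising the quality of the selected labels.
PES combined with semi-supervised learning outperforms the baselines, reaching state-of-the-art performance on CIFAR-10/100 with synthetic noise and competitive results on Clothing-1M.
In the sensitivity analysis, performance is best when Tl for the second and third parts is about 7 and 5 epochs respectively, and this setting is robust across noise types.
PES requires training time comparable to standard early stopping while delivering better performance.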
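To make the "label precision and recall of confident examples" concrete, the sketch below selects examples by agreement across views and scores the selection. It is an illustration under stated assumptions, not the paper's procedure: `select_confident`, `label_precision_recall`, and `inverse_frequency_weights` are hypothetical names, recall is taken here as correctly labeled selected examples divided by all examples, inverse-frequency weighting is only one common choice of class weights, and the toy predictions are invented.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def select_confident(view_predictions: Sequence[Sequence[int]]) -> Dict[int, int]:
    """Map example index -> agreed label for examples whose predicted label
    is identical across all views (rows are views, columns are examples)."""
    if not view_predictions:
        return {}
    num_examples = len(view_predictions[0])
    if any(len(view) != num_examples for view in view_predictions):
        raise ValueError("every view must cover the same examples")
    confident = {}
    for i in range(num_examples):
        labels = {view[i] for view in view_predictions}
        if len(labels) == 1:  # all views agree on this example
            confident[i] = labels.pop()
    return confident


def label_precision_recall(confident: Dict[int, int],
                           true_labels: Sequence[int]) -> Tuple[float, float]:
    """Precision: correct labels among selected examples.
    Recall (assumed definition): correct selected labels over all examples."""
    correct = sum(1 for i, y in confident.items() if true_labels[i] == y)
    precision = correct / len(confident) if confident else 0.0
    recall = correct / len(true_labels) if true_labels else 0.0
    return precision, recall


def inverse_frequency_weights(labels: Sequence[int], num_classes: int) -> List[float]:
    """Class weights N / (C * n_c); classes with no examples get weight 0."""
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]


# Toy data: predictions on an original view and one augmented view.
views = [[0, 1, 2, 1, 0],
         [0, 1, 1, 1, 2]]
true_labels = [0, 1, 2, 0, 0]

confident = select_confident(views)
precision, recall = label_precision_recall(confident, true_labels)
weights = inverse_frequency_weights(list(confident.values()), num_classes=3)
print(confident, round(precision, 3), round(recall, 3), weights)
```

On this toy input three examples are selected, two of them with the correct label, so precision is 2/3 and recall is 2/5.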


This review was generated by AI and reviewed by a human editor.