QUICK REVIEW

[Paper Review] Robust Training under Label Noise by Over-parameterization

Sheng Liu, Zhihui Zhu | arXiv (Cornell University) | 2022-02-28

Machine Learning and Data Classification | Citations: 29

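One-line summary

Introduces sparse over-parameterization (SOP), which separates sparse label noise from clean data in over-parameterized classifiers, and shows with theoretical and empirical support that robustness to corrupted labels improves.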

ABSTRACT

Recently, over-parameterized deep networks, with increasingly more network parameters than training samples, have dominated the performances of modern machine learning. However, when the training data is corrupted, it has been well-known that over-parameterized networks tend to overfit and do not generalize. In this work, we propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted. The main idea is yet very simple: label noise is sparse and incoherent with the network learned from clean data, so we model the noise and learn to separate it from the data. Specifically, we model the label noise via another sparse over-parameterization term, and exploit implicit algorithmic regularizations to recover and separate the underlying corruptions. Remarkably, when trained using such a simple method in practice, we demonstrate state-of-the-art test accuracy against label noise on a variety of real datasets. Furthermore, our experimental results are corroborated by theory on simplified linear models, showing that exact separation between sparse noise and low-rank data can be achieved under incoherent conditions. The work opens many interesting directions for improving over-parameterized models by using sparse over-parameterization and implicit regularization.

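Research motivation and goals

Enable robust training of over-parameterized deep networks when training labels are corrupted.
Propose a practical algorithm that separates sparse label noise from the data during training.
Provide theoretical insight by showing exact separation in simplified linear models.
Demonstrate empirical robustness to label noise across synthetic and real datasets.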

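Proposed method

Models the unknown label noise with an auxiliary sparse term s_i, factorized as s_i = u_i ⊙ u_i − v_i ⊙ v_i.
Optimizes a joint objective over the network parameters θ and the auxiliary variables {u_i, v_i} so that y_i ≈ f(x_i; θ) + s_i.
Applies gradient descent with different learning rates, τ for θ and ατ for (u_i, v_i), to induce implicit regularization.
This implicitly induces an ℓ1 penalty on the sparse noise s_i, connecting the method to robust sparse modeling.
Provides implementation variants for cross-entropy and MSE losses, with suitable projections to enforce the constraints on u_i and v_i.
Theoretical analysis of a simplified over-parameterized linear model shows that, under incoherence and low-rank conditions, the sparse noise is exactly separated from the data.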
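The update rule described above can be sketched on a toy linear model. This is a minimal NumPy illustration, not the authors' implementation: the function name `sop_linear`, the synthetic rank-5 data, and the values of `tau`, `alpha`, `eps`, and `steps` are all assumptions chosen for the example. It uses the squared loss, so the projection step mentioned for the cross-entropy variant is not needed here.

```python
import numpy as np

def sop_linear(X, y, tau=0.05, alpha=1.0, eps=1e-3, steps=2000):
    """Gradient descent on 0.5 * ||X @ theta + u*u - v*v - y||^2.

    theta uses learning rate tau; u and v use alpha * tau.
    u and v start at a small constant eps (hypothetical choice).
    """
    n, p = X.shape
    theta = np.zeros(p)
    u = np.full(n, eps)
    v = np.full(n, eps)
    for _ in range(steps):
        r = X @ theta + u * u - v * v - y      # residual, one entry per sample
        grad_theta = X.T @ r
        grad_u = 2.0 * r * u                   # d/du of the squared loss
        grad_v = -2.0 * r * v                  # d/dv of the squared loss
        theta -= tau * grad_theta
        u -= alpha * tau * grad_u
        v -= alpha * tau * grad_v
    return theta, u * u - v * v                # fitted parameters and noise estimate

# Synthetic data: low-rank features (rank 5), more parameters than samples.
rng = np.random.default_rng(0)
n, p, rank, k = 200, 400, 5, 10
U, _ = np.linalg.qr(rng.standard_normal((n, rank)))   # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((p, rank)))
X = U @ V.T
theta_true = V @ rng.standard_normal(rank)

# Sparse label noise: k corrupted samples with magnitude 2 and random sign.
s_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
s_true[support] = 2.0 * rng.choice([-1.0, 1.0], size=k)
y = X @ theta_true + s_true

theta_hat, s_hat = sop_linear(X, y)
noise_err = np.max(np.abs(s_hat - s_true))             # worst-case error on s
fit_err = np.linalg.norm(X @ (theta_hat - theta_true)) # error on clean predictions
print(f"max |s_hat - s_true| = {noise_err:.2e}, clean-fit error = {fit_err:.2e}")
```

In this toy setting the clean signal lives in a 5-dimensional subspace while the noise touches only 10 of 200 samples, which is meant to mimic the low-rank and incoherence conditions named in the analysis. Both printed errors should be small if the separation works; different hyperparameters or a denser noise pattern may behave differently.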

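Experimental results

Research questions

RQ1: Can over-parameterized models be trained robustly when a fraction of the labels is corrupted?
RQ2: Does the auxiliary sparse over-parameterization term separate label noise from clean data during training?
RQ3: What implicit regularization effect arises from the gradient dynamics of the proposed auxiliary variables?
RQ4: Do the theoretical results for simplified linear models explain the empirical robustness observed with SOP?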

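Key findings

SOP prevents overfitting to incorrect training labels and achieves higher test accuracy under label noise on several datasets.
SOP+ adds consistency and class-balance regularization, which improves performance further.
Experiments indicate that SOP and SOP+ outperform several baselines on CIFAR-10/100 with synthetic and realistic label noise, and also perform well on Clothing-1M and WebVision.
Theoretical analysis of the simplified linear model shows that, under incoherence and low-rank data assumptions, the gradient dynamics recover the true parameters together with the sparse corruption, and act as an ℓ1 regularizer on the noise.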
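To see where the ℓ1-like behavior could come from, here is a back-of-the-envelope derivation for the squared loss on a linear model f(x; θ) = Xθ. The assumptions are ours rather than the review's: continuous-time gradient flow (time measured in units of τ), θ(0) = 0, and u_i(0) = v_i(0) = ε for a small ε > 0. It is a sketch of the mechanism, not the paper's theorem; exponentials and sinh act elementwise.

```latex
\begin{align*}
L(\theta,u,v) &= \tfrac12\,\lVert X\theta + u\odot u - v\odot v - y\rVert_2^2,
  \qquad r(t) = X\theta + u\odot u - v\odot v - y,\\
\dot\theta &= -X^\top r, \qquad
  \dot u = -2\alpha\, r\odot u, \qquad
  \dot v = +2\alpha\, r\odot v,\\
\nu(t) &:= -\int_0^t r(t')\,dt'
  \;\Longrightarrow\;
  \theta = X^\top \nu,\quad
  u = \varepsilon\, e^{2\alpha\nu},\quad
  v = \varepsilon\, e^{-2\alpha\nu},\\
s &= u\odot u - v\odot v = 2\varepsilon^2 \sinh(4\alpha\nu).
\end{align*}
```

Because ε is small, s_i is of order one only where |ν_i| reaches roughly log(|s_i|/ε²)/(4α), and it stays near zero wherever |ν_i| is well below that level. This mirrors the optimality conditions of an ℓ1-penalized problem, where the dual variable saturates at the penalty weight on the support of s and stays strictly inside it elsewhere. A larger α lowers the threshold, which corresponds to a weaker effective penalty on the noise term.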

This review was generated by AI and reviewed by a human editor.