QUICK REVIEW

[Paper Review] Self-supervised Label Augmentation via Input Transformations

Hankook Lee, Sung Ju Hwang | arXiv (Cornell University) | October 14, 2019

Machine Learning and Data Classification | Citations: 54

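One-Line Summary

The paper introduces Self-supervised Label Augmentation (SLA), which trains a single model on a joint task over pairs of the original label and the self-supervised label derived from input transformations. This enables aggregated inference at test time and, through self-distillation, yields strong gains on fully supervised tasks.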

ABSTRACT

Self-supervised learning, which learns by constructing artificial labels given only the input signals, has recently gained considerable attention for learning representations with unlabeled datasets, i.e., learning without any human-annotated supervision. In this paper, we show that such a technique can be used to significantly improve the model accuracy even under fully-labeled datasets. Our scheme trains the model to learn both original and self-supervised tasks, but is different from conventional multi-task learning frameworks that optimize the summation of their corresponding losses. Our main idea is to learn a single unified task with respect to the joint distribution of the original and self-supervised labels, i.e., we augment original labels via self-supervision of input transformation. This simple, yet effective approach allows to train models easier by relaxing a certain invariant constraint during learning the original and self-supervised tasks simultaneously. It also enables an aggregated inference which combines the predictions from different augmentations to improve the prediction accuracy. Furthermore, we propose a novel knowledge transfer technique, which we refer to as self-distillation, that has the effect of the aggregated inference in a single (faster) inference. We demonstrate the large accuracy improvement and wide applicability of our framework on various fully-supervised settings, e.g., the few-shot and imbalanced classification scenarios.

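Research Motivation and Goals

Motivate and exploit self-supervised signals even when labels are available.
Avoid enforcing invariance to transformations that can change semantic content.
Propose a unified joint-label learning framework that augments the original labels with self-supervised labels.
Enable aggregation-based inference so that a single model can behave like an ensemble.
Introduce a self-distillation mechanism that transfers the aggregated knowledge into a faster single-pass inference.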

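Proposed Method

Define the SLA loss L_SLA using a joint softmax rho over (original label, transformation) pairs.
Express P(i,j|x̃) as rho_{ij}(z̃; w) and minimize the cross-entropy against the target (y, j).
Compute P_aggregated(i|x) from the logits w_{ij}^T z̃_j by aggregating over the M transformations.
Introduce the self-distillation objective L_SLA+SD to transfer the aggregated knowledge to a single classifier u, using a KL divergence and an optional CE loss.
Experiment with two transformations, rotation (M=4) and color permutation (M=6), as well as their composition, to improve performance.
At every iteration, feed all M augmented samples as input to optimize L_SLA, with the identity transformation used as t_1.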
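
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`sla_loss`, `aggregate`, `self_distillation_loss`), the nested-list layout of the logits, and the weight `beta` on the optional CE term are all hypothetical choices made for this example. The network that produces the logits is omitted; the sketch starts from logits that are assumed to be already computed for each of the M transformed views.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a flat list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]

def sla_loss(joint_logits, y):
    """Joint-label loss, averaged over the M transformed views.

    Assumed layout: joint_logits[j][i][k] is the logit w_{ik}^T z~_j, i.e. the
    score of the pair (class i, transformation k) computed from the j-th
    transformed view. The target for view j is the pair (y, j).
    """
    num_views = len(joint_logits)
    total = 0.0
    for j, logits in enumerate(joint_logits):
        num_transforms = len(logits[0])
        flat = [v for row in logits for v in row]  # joint softmax over N*M pairs
        total -= log_softmax(flat)[y * num_transforms + j]
    return total / num_views

def aggregate(joint_logits):
    """Aggregated class probabilities: average w_{ij}^T z~_j over j, then softmax over i."""
    num_views = len(joint_logits)
    num_classes = len(joint_logits[0])
    scores = [
        sum(joint_logits[j][i][j] for j in range(num_views)) / num_views
        for i in range(num_classes)
    ]
    return [math.exp(v) for v in log_softmax(scores)]

def self_distillation_loss(joint_logits, student_logits, y, beta=1.0):
    """L_SLA plus KL(aggregated || student) plus an optional CE term weighted by beta.

    student_logits are the N logits of the separate single-pass classifier u.
    beta is a hypothetical name for the weight of the optional CE term.
    """
    p_agg = aggregate(joint_logits)
    log_q = log_softmax(student_logits)
    kl = sum(p * (math.log(p) - lq) for p, lq in zip(p_agg, log_q) if p > 0)
    ce = -log_q[y]
    return sla_loss(joint_logits, y) + kl + beta * ce

# Toy usage: N = 3 classes, M = 2 transformations, true class y = 1.
views = [
    [[0.0, 0.0], [5.0, 0.0], [0.0, 0.0]],  # logits from view t_1 (identity)
    [[0.0, 0.0], [0.0, 5.0], [0.0, 0.0]],  # logits from view t_2
]
probs = aggregate(views)
loss = self_distillation_loss(views, student_logits=[0.0, 2.0, 0.0], y=1)
```

Note that `aggregate` reads only the diagonal entries `joint_logits[j][i][j]`: each view contributes the score for its own transformation, which is what allows a single model to combine M predictions without forcing the features to be invariant to the transformation.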

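Experimental Results

Research Questions

RQ1: Can self-supervised labels improve accuracy on labeled data while avoiding invariance constraints?
RQ2: Does joint-label SLA offer advantages over conventional data augmentation or multi-task self-supervision in terms of accuracy and training difficulty?
RQ3: Can aggregation over augmented samples achieve ensemble-like benefits, and can self-distillation retain those benefits with faster inference?
RQ4: How do the SLA variants perform on standard, few-shot, and imbalanced classification tasks?
RQ5: What is the effect of combining the rotation and color-permutation augmentations, and of composing transformations?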

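Key Results

SLA with rotation or color permutation yields substantial accuracy gains over the baseline on CIFAR-10/100 and tiny-ImageNet.
Rotation-based SLA achieves relative improvements of up to 8.60% on CIFAR-100 and up to 18.8% on CUB200 (with aggregation).
Aggregation (SLA+AG) lets a single model reach performance close to that of an ensemble of independent models.
Self-distillation (SLA+SD) provides faster inference and competitive accuracy when combined with other augmentations.
SLA improves performance in few-shot and imbalanced settings, for example with relative gains of up to 7.05% on 5-shot FC100 and up to 13.3% on imbalanced CIFAR-100.
Composing multiple transformations can further improve the aggregated results on fine-grained datasets such as CUB200 and Stanford Dogs.
SLA is compatible with recent augmentation methods: combining it with methods such as Cutout, CutMix, and AutoAugment improves accuracy on CIFAR-10/100.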

This review was generated by AI and checked by a human editor.