QUICK REVIEW

[Paper Review] Self-Supervised Aggregation of Diverse Experts for Test-Agnostic Long-Tailed Recognition

Yifan Zhang, Bryan Hooi | arXiv (Cornell University) | July 20, 2021

Domain Adaptation and Few-Shot Learning | Citations: 55

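One-Line Summary

SADE trains multiple skill-diverse experts from a single long-tailed dataset and uses self-supervised test-time aggregation to adapt to an unknown test class distribution without any prior knowledge of it.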

ABSTRACT

Existing long-tailed recognition methods, aiming to train class-balanced models from long-tailed data, generally assume the models would be evaluated on the uniform test class distribution. However, practical test class distributions often violate this assumption (e.g., being either long-tailed or even inversely long-tailed), which may lead existing methods to fail in real applications. In this paper, we study a more practical yet challenging task, called test-agnostic long-tailed recognition, where the training class distribution is long-tailed while the test class distribution is agnostic and not necessarily uniform. In addition to the issue of class imbalance, this task poses another challenge: the class distribution shift between the training and test data is unknown. To tackle this task, we propose a novel approach, called Self-supervised Aggregation of Diverse Experts, which consists of two strategies: (i) a new skill-diverse expert learning strategy that trains multiple experts from a single and stationary long-tailed dataset to separately handle different class distributions; (ii) a novel test-time expert aggregation strategy that leverages self-supervision to aggregate the learned multiple experts for handling unknown test class distributions. We theoretically show that our self-supervised strategy has a provable ability to simulate test-agnostic class distributions. Promising empirical results demonstrate the effectiveness of our method on both vanilla and test-agnostic long-tailed recognition. Code is available at https://github.com/Vanint/SADE-AgnosticLT.

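Motivation and Objectives

Targets practical long-tailed recognition in which the test class distribution is not necessarily uniform.
Trains multiple skill-diverse experts from a single long-tailed dataset.
Proposes self-supervised test-time aggregation to handle unknown test class distributions.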

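Proposed Method

Three experts are trained with three different losses so that together they cover long-tailed, uniform, and inversely long-tailed class distributions (forward, uniform, backward).
The forward expert E1 uses cross-entropy loss to simulate the long-tailed training distribution.
The uniform expert E2 uses balanced softmax loss to simulate the uniform distribution.
The backward expert E3 uses inverse softmax loss to simulate the inversely long-tailed distribution.
The test-time aggregation weights are learned by maximizing prediction stability across two augmented views of each unlabeled test sample.
Aggregation is computed as y_hat = softmax(w1 v1 + w2 v2 + w3 v3), where v1, v2, v3 are the outputs of the three experts and the weights w are normalized to sum to 1.
The theoretical analysis shows that this objective is related to maximizing the mutual information between the predicted label distribution and the true test class distribution.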
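
The method steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract). All function names (`expert_loss`, `aggregate`, `stability`, `fit_weights`) are hypothetical. The review does not give the inverse softmax formula, so the form used here (adding log prior and subtracting `lam` times the log of the order-reversed prior) is an assumption, as is the requirement that `prior` be sorted from most to least frequent class. Normalizing the weights with a softmax, measuring stability by cosine similarity, and using finite differences in place of backpropagation are also assumptions made to keep the sketch dependency-free.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def expert_loss(logits, label, prior, kind, lam=2.0):
    """Per-sample loss for one expert. `prior` holds training class
    frequencies, assumed sorted from head to tail and summing to 1."""
    if kind == "forward":      # E1: plain softmax cross-entropy
        adjusted = list(logits)
    elif kind == "uniform":    # E2: balanced softmax adds the log prior
        adjusted = [v + math.log(p) for v, p in zip(logits, prior)]
    elif kind == "backward":   # E3: inverse softmax (assumed form, see lead-in)
        inv = list(reversed(prior))
        adjusted = [v + math.log(p) - lam * math.log(q)
                    for v, p, q in zip(logits, prior, inv)]
    else:
        raise ValueError(kind)
    return -math.log(softmax(adjusted)[label])

def aggregate(expert_logits, raw_weights):
    """y_hat = softmax(w1*v1 + w2*v2 + w3*v3) with w = softmax(raw_weights),
    so the weights are positive and sum to 1 (assumed normalization)."""
    w = softmax(raw_weights)
    num_classes = len(expert_logits[0])
    mixed = [sum(w[k] * expert_logits[k][c] for k in range(len(w)))
             for c in range(num_classes)]
    return softmax(mixed)

def cosine(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

def stability(view1, view2, raw_weights):
    """Mean cosine similarity between the aggregated predictions of two
    augmented views. Each view is a list over samples; each sample is a
    list of per-expert logit vectors."""
    sims = [cosine(aggregate(a, raw_weights), aggregate(b, raw_weights))
            for a, b in zip(view1, view2)]
    return sum(sims) / len(sims)

def fit_weights(view1, view2, steps=200, lr=0.5, eps=1e-4):
    """Gradient ascent on stability; finite differences stand in for
    backpropagation. No labels are used."""
    raw = [0.0, 0.0, 0.0]
    for _ in range(steps):
        base = stability(view1, view2, raw)
        grad = []
        for k in range(len(raw)):
            bumped = list(raw)
            bumped[k] += eps
            grad.append((stability(view1, view2, bumped) - base) / eps)
        raw = [r + lr * g for r, g in zip(raw, grad)]
    return softmax(raw)

# Toy usage: expert 1 agrees with itself across views, expert 3 does not.
view1 = [[[5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 5.0, 0.0]]]
view2 = [[[5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 5.0]]]
weights = fit_weights(view1, view2)
```

In this toy case the expert whose predictions stay consistent across the two views ends up with a larger weight than the expert whose predictions flip, which is the behavior the stability objective is meant to encourage.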

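Experimental Results

Research Questions

RQ1: How can multiple skill-diverse experts be obtained from a single long-tailed training set?
RQ2: Can self-supervised aggregation adapt to an unknown test class distribution without prior knowledge of it?
RQ3: Does maximizing prediction stability at test time yield weights that effectively simulate the unknown test class distribution?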

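Key Results

SADE achieves strong vanilla long-tailed recognition performance on multiple datasets.
On ImageNet-LT, SADE achieves 66.5 (Many), 57.0 (Med), 43.5 (Few), and 58.8 (All), outperforming a number of baselines.
SADE outperforms RIDE and ACE in many settings.
In test-agnostic long-tailed recognition, SADE outperforms methods that rely on a known test class prior (such as LADE) and remains robust across forward, uniform, and backward test distributions.
Prediction stability correlates with expert strength, which is what makes self-supervised aggregation at test time effective.
The theoretical analysis relates prediction stability to the mutual information I(Y_hat; Y) and to alignment with the test class distribution.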
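
For readers unfamiliar with the quantity I(Y_hat; Y), the standard textbook definition of mutual information is reproduced below. This is the general identity only; it is not the paper's theorem, whose exact statement is not given in this review.

```latex
I(\hat{Y}; Y)
  = H(\hat{Y}) - H(\hat{Y} \mid Y)
  = \sum_{\hat{y},\, y} p(\hat{y}, y) \,
    \log \frac{p(\hat{y}, y)}{p(\hat{y})\, p(y)}
```

A larger value means the predicted labels carry more information about the true labels.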

This review was generated by AI and reviewed by a human editor.