QUICK REVIEW

[Paper Review] Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect

Kaihua Tang, Jianqiang Huang | arXiv (Cornell University) | 2020. 09. 27.

Fault Detection and Control Systems | Citations: 231

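One-Line Summary

The paper introduces a causal framework that treats SGD momentum as a confounder in long-tailed classification. By combining de-confounded training with direct-effect inference, it achieves state-of-the-art results on multiple benchmarks.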

ABSTRACT

As the class size grows, maintaining a balanced dataset across many classes is challenging because the data are long-tailed in nature; it is even impossible when the sample-of-interest co-exists with each other in one collectable unit, e.g., multiple visual instances in one image. Therefore, long-tailed classification is the key to deep learning at scale. However, existing methods are mainly based on re-weighting/re-sampling heuristics that lack a fundamental theory. In this paper, we establish a causal inference framework, which not only unravels the whys of previous methods, but also derives a new principled solution. Specifically, our theory shows that the SGD momentum is essentially a confounder in long-tailed classification. On one hand, it has a harmful causal effect that misleads the tail prediction biased towards the head. On the other hand, its induced mediation also benefits the representation learning and head prediction. Our framework elegantly disentangles the paradoxical effects of the momentum, by pursuing the direct causal effect caused by an input sample. In particular, we use causal intervention in training, and counterfactual reasoning in inference, to remove the "bad" while keep the "good". We achieve new state-of-the-arts on three long-tailed visual recognition benchmarks: Long-tailed CIFAR-10/-100, ImageNet-LT for image classification and LVIS for instance segmentation.

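Motivation and Objectives

Argues that long-tailed classification needs a principled understanding that goes beyond re-weighting/re-sampling heuristics.
Develops a causal model in which SGD momentum acts as both a confounder and a source of mediation in the long-tailed setting.
Proposes a single-stage solution, with no retraining, that separates the direct causal effect from the mediation to improve tail accuracy.
Validates the theory with measured gains on benchmarks including Long-tailed CIFAR-10/-100, ImageNet-LT, and LVIS.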

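Proposed Method

Builds a causal graph over the momentum M, the feature X, the projection onto the head direction D, and the prediction Y to model confounding and mediation.
Applies the backdoor adjustment to derive a de-confounded training objective that estimates P(Y|do(X)) through inverse probability weighting.
Expresses the logits of P(Y=i|do(X=x)) as an energy-based, multi-head normalized classifier (Eq. 7).
Computes the Total Direct Effect (TDE) of X on Y by counterfactual inference, subtracting the counterfactual prediction from the factual one (Eq. 8).
For tasks with a background class, uses Background-Exempted Inference, which keeps the head bias while evaluating TDE-based predictions.
Connects the approach theoretically to the two-stage and normalization-based methods listed in Table 1, and explains when De-confound-TDE outperforms those alternatives.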
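
The classifier and inference steps listed above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names, tensor shapes, the default values of `num_heads`, `tau`, `gamma`, and `alpha`, and the exact form of the normalization are assumptions made here, and the paper's Eq. 7 and Eq. 8 should be consulted for the authoritative definitions. The sketch assumes a moving-average feature vector (`mean_feature`) has been accumulated during training to represent the head direction.

```python
import numpy as np

_EPS = 1e-12  # guards against division by zero for all-zero vectors


def _heads(a, num_heads):
    """Split the last axis into `num_heads` equal chunks (D must divide evenly)."""
    return np.split(a, num_heads, axis=-1)


def deconfounded_logits(x, w, num_heads=2, tau=16.0, gamma=1.0 / 32.0):
    """Multi-head normalized logits (assumed form of the de-confounded classifier).

    x: (N, D) features, w: (C, D) classifier weights.
    Each head contributes a cosine-like score; gamma softens the weight norm.
    """
    logits = np.zeros((x.shape[0], w.shape[0]))
    for xk, wk in zip(_heads(x, num_heads), _heads(w, num_heads)):
        xk_n = xk / (np.linalg.norm(xk, axis=1, keepdims=True) + _EPS)
        wk_n = wk / (np.linalg.norm(wk, axis=1, keepdims=True) + gamma)
        logits += xk_n @ wk_n.T
    return (tau / num_heads) * logits


def tde_logits(x, w, mean_feature, num_heads=2, tau=16.0,
               gamma=1.0 / 32.0, alpha=1.5):
    """TDE-style inference: factual logits minus a scaled counterfactual term.

    mean_feature: (D,) moving-average feature (hypothetical name) whose
    per-head unit vector stands in for the head direction.
    alpha trades off how much of the head-direction component is removed.
    """
    logits = np.zeros((x.shape[0], w.shape[0]))
    for xk, wk, dk in zip(_heads(x, num_heads), _heads(w, num_heads),
                          _heads(mean_feature, num_heads)):
        xk_n = xk / (np.linalg.norm(xk, axis=1, keepdims=True) + _EPS)
        wk_n = wk / (np.linalg.norm(wk, axis=1, keepdims=True) + gamma)
        dk_n = dk / (np.linalg.norm(dk) + _EPS)
        cos = xk_n @ dk_n  # (N,) cosine between each feature and the head direction
        # Remove the part of the normalized feature that lies along the head direction.
        logits += (xk_n - alpha * cos[:, None] * dk_n[None, :]) @ wk_n.T
    return (tau / num_heads) * logits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(5, 8))      # 5 samples, 8-dim features
    weights = rng.normal(size=(3, 8))    # 3 classes
    mean_feat = feats.mean(axis=0)       # stand-in for the training-time moving average
    print(tde_logits(feats, weights, mean_feat).argmax(axis=1))
```

With `alpha=0` the TDE logits reduce to the plain de-confounded logits, which makes the counterfactual subtraction easy to ablate. Background-Exempted Inference is not shown; under the description above it would presumably keep the unadjusted score for the background class and apply the adjustment only to the remaining classes.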

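Experimental Results

Research Questions

RQ1: How does SGD momentum causally affect feature representations and predictions on long-tailed data?
RQ2: Can tail accuracy be improved by removing the bad confounding effect while keeping the good mediation effect?
RQ3: Does a single-stage, retraining-free solution that combines de-confounded training with direct-effect (TDE) inference deliver robust gains across datasets?
RQ4: How does the proposed approach connect to, and explain, existing re-balancing and normalization-based methods?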

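Key Results

Achieves a new state of the art on three long-tailed visual recognition benchmarks, including object detection and instance segmentation on LVIS.
With the same Cascade Mask R-CNN backbone, improves mask AP by 3.5 percentage points and box AP by 3.1 percentage points (absolute) on LVIS.
De-confounded training with TDE inference consistently outperforms earlier re-balancing and single-stage methods in the many-shot, medium-shot, and few-shot regimes.
Gives a principled explanation of why two-stage training methods work, and of why single-stage De-confound-TDE is more effective and more training-efficient.
Visualizations show that De-confound-TDE focuses on discriminative regions rather than broad context, which is consistent with its emphasis on the direct effect.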

This review was generated by AI and reviewed by a human editor.