QUICK REVIEW

[Paper Review] Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning

Yifan Zhang, Bryan Hooi|arXiv (Cornell University)|2021. 02. 12.

Domain Adaptation and Few-Shot Learning|References 78|Citations 29

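One-Line Summary

This paper introduces Core-tuning, a contrast-regularized fine-tuning method for contrastive self-supervised visual models that uses hard pair mining and a focal contrastive loss to improve downstream classification and segmentation performance.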

ABSTRACT

Contrastive self-supervised learning (CSL) has attracted increasing attention for model pre-training via unlabeled data. The resulted CSL models provide instance-discriminative visual features that are uniformly scattered in the feature space. During deployment, the common practice is to directly fine-tune CSL models with cross-entropy, which however may not be the best strategy in practice. Although cross-entropy tends to separate inter-class features, the resulting models still have limited capability for reducing intra-class feature scattering that exists in CSL models. In this paper, we investigate whether applying contrastive learning to fine-tuning would bring further benefits, and analytically find that optimizing the contrastive loss benefits both discriminative representation learning and model optimization during fine-tuning. Inspired by these findings, we propose Contrast-regularized tuning (Core-tuning), a new approach for fine-tuning CSL models. Instead of simply adding the contrastive loss to the objective of fine-tuning, Core-tuning further applies a novel hard pair mining strategy for more effective contrastive fine-tuning, as well as smoothing the decision boundary to better exploit the learned discriminative feature space. Extensive experiments on image classification and semantic segmentation verify the effectiveness of Core-tuning.

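Motivation and Objectives

Motivation: improve the fine-tuning of contrastive self-supervised learning (CSL) models so that their discriminative feature space is better exploited.
Show that the contrastive loss provides regularization and optimization benefits during fine-tuning.
Develop Core-tuning, which improves downstream performance through hard sample mining and smooth classifier learning.
Demonstrate its effectiveness on image classification and semantic segmentation, as well as on domain generalization and robustness.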

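Proposed Method

A theoretical analysis (Theorems 1 and 2) of how the contrastive loss regularizes representation learning and benefits optimization during fine-tuning.
Core-tuning, which includes hardness-directed mixup to generate hard positive and hard negative samples for each anchor.
A focal contrastive loss that assigns higher weight to hard positive pairs during fine-tuning.
A projection head G_c that produces the normalized contrastive features used for loss reweighting.
A combination of the cross-entropy loss and the focal contrastive loss, with mixup-based data augmentation to learn a smoother classifier.
Training objective: minimize L_ce^m + η * L_con^f, where the cross-entropy term is computed with mixup-based classifier learning to improve generalization.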
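The objective above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the exact focal weight `(1 - p_ij) ** gamma`, the temperature `tau`, the function names, and the use of soft (mixed) labels for the cross-entropy term are all assumptions made for illustration, and the hardness-directed generation of hard pairs is omitted.

```python
import numpy as np

def focal_contrastive_loss(z, labels, tau=0.07, gamma=1.0):
    """Supervised contrastive loss with an assumed focal weight on positive pairs."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize features
    n = z.shape[0]
    logits = z @ z.T / tau
    np.fill_diagonal(logits, -np.inf)                 # exclude each anchor itself
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    p = np.exp(log_prob)                              # p[i, j]: pair probability
    # Positive pairs: same label, different sample.
    pos = (labels[:, None] == labels[None, :]) & ~np.eye(n, dtype=bool)
    # Focal weight (assumed form): low-probability (hard) positives count more.
    weight = np.clip(1.0 - p, 0.0, 1.0) ** gamma
    pair_loss = np.where(pos, -weight * log_prob, 0.0)
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0                                 # anchors with >= 1 positive
    if not valid.any():
        return 0.0
    return float((pair_loss.sum(axis=1)[valid] / n_pos[valid]).mean())

def soft_cross_entropy(logits, soft_targets):
    """Cross-entropy against soft labels, e.g. labels of mixed samples."""
    logits = logits - logits.max(axis=1, keepdims=True)
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-(soft_targets * log_softmax).sum(axis=1).mean())

def core_tuning_objective(cls_logits, soft_targets, z, labels, eta=1.0):
    """L_ce^m + eta * L_con^f, with both terms simplified as described above."""
    return soft_cross_entropy(cls_logits, soft_targets) + eta * focal_contrastive_loss(z, labels)
```

Setting `gamma=0` in this sketch removes the focal weight and recovers a plain supervised contrastive loss, which makes the effect of the reweighting easy to compare.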

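Experimental Results

Research Questions

RQ1: Does applying contrastive learning to the fine-tuning of CSL models improve downstream performance over standard cross-entropy fine-tuning?
RQ2: How can hard sample mining and classifier smoothing be integrated to maximize the benefit of contrastive fine-tuning?
RQ3: How does Core-tuning affect the domain generalization and adversarial robustness of CSL models?
RQ4: Does the proposed approach generalize across architectures, pre-training methods, and downstream tasks such as semantic segmentation?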

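Key Results

Core-tuning substantially improves fine-tuning performance on multiple datasets compared with CE-tuning and other baselines.
Ablation studies confirm that each component (hard pair mining, focal loss, mixup-based sample mixing, smooth classifier learning) contributes to the gain.
With a MoCo-v2 pre-trained ResNet-50 fine-tuned on 9 natural image datasets, Core-tuning achieves higher average top-1 accuracy than CE-tuning and outperforms several baselines.
Semantic segmentation performance on PASCAL VOC also improves when fine-tuning from a CSL pre-trained backbone.
Core-tuning yields better cross-domain generalization on the PACS, VLCS, and Office-Home datasets, and also shows robustness in adversarial training settings.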


This review was generated by AI and reviewed by a human editor.