QUICK REVIEW

[Paper Review] Revisiting Local Descriptor based Image-to-Class Measure for Few-shot Learning

Wenbin Li, Lei Wang | arXiv (Cornell University) | 2019. 03. 28.

Domain Adaptation and Few-Shot Learning | References: 22 | Citations: 43

One-line summary

This paper introduces DN4, a local descriptor based network that uses an image-to-class measure, trains it episodically, and achieves state-of-the-art results on several few-shot benchmarks.

ABSTRACT

Few-shot learning in image classification aims to learn a classifier to classify images when only few training examples are available for each class. Recent work has achieved promising classification performance, where an image-level feature based measure is usually used. In this paper, we argue that a measure at such a level may not be effective enough in light of the scarcity of examples in few-shot learning. Instead, we think a local descriptor based image-to-class measure should be taken, inspired by its surprising success in the heydays of local invariant features. Specifically, building upon the recent episodic training mechanism, we propose a Deep Nearest Neighbor Neural Network (DN4 in short) and train it in an end-to-end manner. Its key difference from the literature is the replacement of the image-level feature based measure in the final layer by a local descriptor based image-to-class measure. This measure is conducted online via a $k$-nearest neighbor search over the deep local descriptors of convolutional feature maps. The proposed DN4 not only learns the optimal deep local descriptors for the image-to-class measure, but also utilizes the higher efficiency of such a measure in the case of example scarcity, thanks to the exchangeability of visual patterns across the images in the same class. Our work leads to a simple, effective, and computationally efficient framework for few-shot learning. Experimental study on benchmark datasets consistently shows its superiority over the related state-of-the-art, with the largest absolute improvement of $17\%$ over the next best. The source code can be available from https://github.com/WenbinLee/DN4.git.

Motivation and objectives

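Prompt a rethinking of few-shot learning by moving the final classification step from an image-level measure to a local descriptor based measure.
Exploit the transferability and exchangeability of local visual patterns across images of the same class.
Propose an end-to-end trainable framework that combines deep local descriptors with a non-parametric image-to-class measure.
Demonstrate empirical improvements over existing metric-learning and meta-learning methods on standard few-shot benchmarks.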

Proposed method

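Embed each image with a CNN and take the deep local descriptors from its convolutional feature maps.
For each query descriptor, run a k-nearest-neighbor search over the pool of descriptors belonging to a class, which gives a per-descriptor image-to-class measure.
Sum the cosine similarities over all query descriptors and their k-NN matches to obtain the class score used for classification.
Train the embedding end to end through the non-parametric measure, using episodic training (C-way K-shot tasks).
Use Conv-64F as the embedding module (a deeper backbone such as ResNet-256F is also possible).
Tune the hyperparameter k and show robustness across a range of settings.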
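
The k-NN scoring step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names (`cosine`, `image_to_class_score`, `classify`), the 2-D toy descriptors, and the class labels are all assumptions made for the example, and the CNN embedding is replaced by hand-written vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def image_to_class_score(query_descriptors, class_pool, k=3):
    """Sum, over every query descriptor, the cosine similarities to its
    k nearest neighbours in the pooled descriptors of one class."""
    score = 0.0
    for q in query_descriptors:
        sims = sorted((cosine(q, d) for d in class_pool), reverse=True)
        score += sum(sims[:k])  # fewer than k neighbours: use all of them
    return score

def classify(query_descriptors, pools, k=3):
    """Return the best-scoring class label and the score of every class."""
    scores = {
        label: image_to_class_score(query_descriptors, pool, k)
        for label, pool in pools.items()
    }
    return max(scores, key=scores.get), scores

# Toy stand-ins for deep local descriptors (hypothetical values).
pools = {
    "cat": [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]],
    "dog": [[0.0, 1.0], [0.1, 0.9], [0.2, 0.8]],
}
query = [[1.0, 0.1], [0.7, 0.3]]

label, scores = classify(query, pools, k=1)
print(label, scores)
```

In a real model the descriptors would come from the convolutional feature map of the embedding network, and the summed score would feed the training loss so that gradients reach the embedding; the sketch only covers the non-parametric scoring.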

Experimental results

Research questions

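RQ1: Does a local descriptor based image-to-class measure improve few-shot classification over image-level features?
RQ2: Does training a local descriptor based non-parametric classifier end to end, as DN4 does, outperform standard metric-learning and meta-learning approaches?
RQ3: How do the design choices (k, backbone, over-/under-matching) affect DN4's performance across datasets?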

Key findings

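On miniImageNet, DN4 achieves higher accuracy than several state-of-the-art metric-learning methods on both the 5-way 1-shot and 5-way 5-shot tasks (e.g., 51.24% vs. 49.42% and 71.02% vs. 68.20%).
Using deep features as the local descriptors, together with the image-to-class measure, yields large gains, especially on fine-grained datasets.
Apart from the embedding module, the model stays non-parametric at test time, and it can be trained end to end.
A deeper backbone (ResNet-256F) improves performance further, reaching, for example, 74.44% in the 5-shot setting.
In the ablation study, the image-to-class measure outperforms the image-to-image variant; the gain comes from the exchangeability of local patterns within a class.
DN4 is competitive with meta-learning baselines and often outperforms them in the 5-shot setting.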
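
The exchangeability argument behind the ablation result can be illustrated with a toy comparison. The sketch below is an assumption-laden illustration, not the paper's ablation code: `image_to_image` is one hypothetical image-level baseline (score each support image separately and keep the best), which is not necessarily the variant the authors evaluated, and the two-dimensional descriptors stand in for two distinct local patterns.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def knn_score(query_descriptors, pool, k=1):
    """Sum of each query descriptor's top-k cosine similarities in `pool`."""
    total = 0.0
    for q in query_descriptors:
        sims = sorted((cosine(q, d) for d in pool), reverse=True)
        total += sum(sims[:k])
    return total

def image_to_class(query_descriptors, support_images, k=1):
    """Pool the descriptors of all support images of a class, then match."""
    pool = [d for image in support_images for d in image]
    return knn_score(query_descriptors, pool, k)

def image_to_image(query_descriptors, support_images, k=1):
    """Hypothetical baseline: match against one support image at a time
    and keep the best single-image score."""
    return max(knn_score(query_descriptors, image, k) for image in support_images)

# The query shows two local patterns, A = [1, 0] and B = [0, 1].
query = [[1.0, 0.0], [0.0, 1.0]]
# Two support images of the same class: one shows only A, the other only B.
support = [[[1.0, 0.0]], [[0.0, 1.0]]]

i2c = image_to_class(query, support)   # 2.0: both patterns find a match
i2i = image_to_image(query, support)   # 1.0: no single image holds both
print(i2c, i2i)
```

Pooling lets a query borrow pattern A from one support image and pattern B from another, which is the sense in which local patterns are exchangeable within a class.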


This review was generated by AI and checked by a human editor.