QUICK REVIEW

[Paper Review] PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment

Kaixin Wang, Jun Hao Liew | arXiv (Cornell University) | 2019-08-18

Domain Adaptation and Few-Shot Learning | References: 28 | Citations: 132

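One-Line Summary

PANet takes a non-parametric, prototype-based approach to few-shot segmentation and introduces a prototype alignment regularization that aligns support and query prototypes, achieving state-of-the-art results on PASCAL-5i and MS COCO.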

ABSTRACT

Despite the great progress made by deep CNNs in image semantic segmentation, they typically require a large number of densely-annotated images for training and are difficult to generalize to unseen object categories. Few-shot segmentation has thus been developed to learn to perform segmentation from only a few annotated examples. In this paper, we tackle the challenging few-shot segmentation problem from a metric learning perspective and present PANet, a novel prototype alignment network to better utilize the information of the support set. Our PANet learns class-specific prototype representations from a few support images within an embedding space and then performs segmentation over the query images through matching each pixel to the learned prototypes. With non-parametric metric learning, PANet offers high-quality prototypes that are representative for each semantic class and meanwhile discriminative for different classes. Moreover, PANet introduces a prototype alignment regularization between support and query. With this, PANet fully exploits knowledge from the support and provides better generalization on few-shot segmentation. Significantly, our model achieves the mIoU score of 48.1% and 55.7% on PASCAL-5i for 1-shot and 5-shot settings respectively, surpassing the state-of-the-art method by 1.8% and 8.6%.

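Motivation and Objectives

Develop a few-shot segmentation framework based on class-specific prototypes learned from the support images.
Decouple prototype extraction from non-parametric metric learning to improve generalization.
Use a prototype alignment regularization to align the support and query prototypes during training.
Demonstrate robustness to weaker support-set annotations such as scribbles or bounding boxes.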

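Proposed Method

Embed the support and query images with a shared backbone to obtain feature maps.
Compute a prototype for each class and for the background by masked average pooling over the support features.
Segment the query by assigning each pixel to its nearest prototype in the embedding space, using cosine distance with a fixed scaling constant.
Apply prototype alignment regularization (PAR): build prototypes from the query using its predicted mask, re-segment the support images with them, and compute the PAR loss against the support annotations.
Train end to end with the segmentation loss and the PAR loss term combined (L = L_seg + lambda * L_PAR).
Optionally extend the support set to weaker annotations (scribbles, bounding boxes).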
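
The steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the scaling value alpha = 20.0, the weight lam = 1.0, and the assumption that feature maps and label masks share the same spatial resolution are all choices made here for illustration. Label 0 stands for background. Scaled cosine similarity is used in place of negative scaled cosine distance; the two differ by a constant, so the softmax is unchanged.

```python
import numpy as np

def masked_average_pooling(features, mask):
    """features: (C, H, W); mask: (H, W) boolean. Returns a (C,) prototype."""
    mask = mask.astype(features.dtype)
    total = (features * mask[None]).sum(axis=(1, 2))
    return total / (mask.sum() + 1e-5)  # epsilon guards against empty masks

def cosine_logits(features, prototypes, alpha=20.0):
    """Scaled cosine similarity of every pixel to every prototype.

    features: (C, H, W); prototypes: (K, C). Returns (K, H, W).
    alpha is an assumed value for the fixed scaling constant.
    """
    f = features / (np.linalg.norm(features, axis=0, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    return alpha * np.einsum("kc,chw->khw", p, f)

def cross_entropy(logits, labels):
    """Mean pixel-wise cross-entropy. logits: (K, H, W); labels: (H, W) ints."""
    z = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=0, keepdims=True))
    h, w = labels.shape
    picked = log_probs[labels, np.arange(h)[:, None], np.arange(w)[None, :]]
    return -picked.mean()

def class_prototypes(features, labels, num_classes):
    """One prototype per label value 0..num_classes-1 (0 = background)."""
    return np.stack([masked_average_pooling(features, labels == k)
                     for k in range(num_classes)])

def episode_loss(sup_feat, sup_label, qry_feat, qry_label, num_classes, lam=1.0):
    """L = L_seg + lam * L_PAR for a 1-shot episode (lam is an assumed weight)."""
    # Support -> query: segment the query with support prototypes.
    protos = class_prototypes(sup_feat, sup_label, num_classes)
    qry_logits = cosine_logits(qry_feat, protos)
    loss_seg = cross_entropy(qry_logits, qry_label)
    # Query -> support (PAR): prototypes from the predicted query mask
    # are used to segment the support image.
    qry_pred = qry_logits.argmax(axis=0)
    qry_protos = class_prototypes(qry_feat, qry_pred, num_classes)
    sup_logits = cosine_logits(sup_feat, qry_protos)
    loss_par = cross_entropy(sup_logits, sup_label)
    return loss_seg + lam * loss_par, qry_pred
```

In a real training loop the features would come from the shared backbone and the loss would be differentiated by an autodiff framework; NumPy is used here only to keep the arithmetic visible.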

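Experimental Results

Research Questions

RQ1: Can a non-parametric, prototype-based metric learning approach achieve competitive few-shot segmentation without a heavy decoder module?
RQ2: Does enforcing alignment between support and query prototypes during training improve generalization to unseen classes?
RQ3: How does PANet perform on standard benchmarks (PASCAL-5i, MS COCO) in the 1-shot and 5-shot settings and under weaker annotations?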

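Key Results

PANet achieves 48.1% mean-IoU in the 1-shot setting and 55.7% in the 5-shot setting on PASCAL-5i, outperforming previous methods.
On PASCAL-5i, PANet surpasses the prior state of the art by 8.6% mean-IoU in the 5-shot setting (and by 1.8% in the 1-shot setting).
Prototype alignment regularization (PAR) speeds up convergence and aligns the support and query prototypes more tightly, as shown by a smaller Euclidean distance between them.
PANet also achieves the best results on MS COCO in both the 1-shot and 5-shot settings, outperforming previous methods by a considerable margin.
PANet remains effective when the support set carries only weak annotations such as scribbles or bounding boxes.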
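
For readers unfamiliar with the metric, the following is a minimal sketch of one common way to compute mean-IoU: intersections and unions are accumulated per class over all evaluated images, and the per-class IoUs are then averaged. The review does not spell out the paper's exact evaluation protocol, so the function name and the accumulation scheme here are assumptions made for illustration.

```python
import numpy as np

def mean_iou(preds, targets, class_ids):
    """Average of per-class IoU over class_ids.

    preds, targets: sequences of (H, W) integer label maps.
    Classes that never appear in either preds or targets are skipped.
    """
    ious = []
    for c in class_ids:
        inter = 0
        union = 0
        for p, t in zip(preds, targets):
            inter += np.logical_and(p == c, t == c).sum()
            union += np.logical_or(p == c, t == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else float("nan")
```

Accumulating per class before dividing means that images with large objects weigh more than images with small ones; averaging per-image IoUs instead would give a different number.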


This review was generated by AI and reviewed by a human editor.