QUICK REVIEW

[Paper Review] AugFPN: Improving Multi-scale Feature Learning for Object Detection

Chaoxu Guo, Bin Fan | arXiv (Cornell University) | 2019-12-11

Advanced Neural Network Applications | References: 52 | Citations: 42

One-line summary

AugFPN introduces Consistent Supervision, Residual Feature Augmentation, and Soft RoI Selection to address design defects in FPN, yielding consistent AP gains on COCO across backbones and detectors.

ABSTRACT

Current state-of-the-art detectors typically exploit feature pyramid to detect objects at different scales. Among them, FPN is one of the representative works that build a feature pyramid by multi-scale features summation. However, the design defects behind prevent the multi-scale features from being fully exploited. In this paper, we begin by first analyzing the design defects of feature pyramid in FPN, and then introduce a new feature pyramid architecture named AugFPN to address these problems. Specifically, AugFPN consists of three components: Consistent Supervision, Residual Feature Augmentation, and Soft RoI Selection. AugFPN narrows the semantic gaps between features of different scales before feature fusion through Consistent Supervision. In feature fusion, ratio-invariant context information is extracted by Residual Feature Augmentation to reduce the information loss of feature map at the highest pyramid level. Finally, Soft RoI Selection is employed to learn a better RoI feature adaptively after feature fusion. By replacing FPN with AugFPN in Faster R-CNN, our models achieve 2.3 and 1.6 points higher Average Precision (AP) when using ResNet50 and MobileNet-v2 as backbone respectively. Furthermore, AugFPN improves RetinaNet by 1.6 points AP and FCOS by 0.9 points AP when using ResNet50 as backbone. Codes will be made available.

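Research motivation and goals

Identify the design defects in FPN's feature pyramid that keep multi-scale features from being fully exploited.
Propose AugFPN, whose three components address the semantic gap between pyramid levels, information loss at the top level, and the inefficiency of heuristic RoI assignment.
Evaluate AugFPN on MS COCO across several detectors and backbones to examine robustness and generalization.
Demonstrate compatibility with both one-stage and two-stage detectors.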

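Proposed method

Consistent Supervision, which enforces similar semantic information across the multi-scale feature maps before fusion.
Residual Feature Augmentation, which injects ratio-invariant context features into the highest-level pyramid map (M5) through a residual branch and Adaptive Spatial Fusion.
Soft RoI Selection, which uses Adaptive Spatial Fusion to learn an adaptive fusion of RoI features from all pyramid levels, avoiding heuristic level assignment.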
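
The contrast between heuristic level assignment and Soft RoI Selection can be made concrete with a small sketch. The `fpn_level` function follows the assignment rule from the original FPN paper (k = floor(k0 + log2(sqrt(wh)/224)), clamped to the available levels). The `soft_roi_fuse` function is only an illustration of the idea of a learned, per-position weighting across levels: the function names, the flat-list feature layout, and the softmax normalization are assumptions of this sketch, and the logits are passed in by hand rather than predicted by the small convolutional branch that Adaptive Spatial Fusion would use.

```python
import math

def fpn_level(w, h, k0=4, k_min=2, k_max=5, canonical=224):
    """Heuristic from the original FPN paper: pick ONE pyramid level per RoI."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    return max(k_min, min(k_max, k))

def softmax(xs):
    """Numerically stable softmax over a short list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def soft_roi_fuse(level_feats, level_logits):
    """Fuse RoI features pooled from every level with per-position weights.

    level_feats:  one flat feature vector per pyramid level (equal lengths).
    level_logits: one logit vector per level, same shape as level_feats.
                  Hypothetical stand-in for the output of a learned branch.
    Softmax across levels is an assumption of this sketch.
    """
    n_levels = len(level_feats)
    n_pos = len(level_feats[0])
    fused = []
    for p in range(n_pos):
        weights = softmax([level_logits[l][p] for l in range(n_levels)])
        fused.append(sum(w * level_feats[l][p] for l, w in enumerate(weights)))
    return fused

# A 112x112 RoI is routed to a single level by the heuristic ...
print(fpn_level(112, 112))
# ... whereas the soft variant blends features from all levels.
feats = [[1.0, 0.0], [3.0, 0.0]]
logits = [[0.0, 0.0], [0.0, 0.0]]   # equal logits -> plain average
print(soft_roi_fuse(feats, logits))
```

With equal logits the fusion reduces to a plain average across levels, and a very large logit on one level approaches the hard, single-level choice, so the heuristic can be seen as a limiting case of the soft weighting.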

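Experimental results

Research questions

RQ1: Can Consistent Supervision narrow the semantic gap between pyramid levels before fusion?
RQ2: Does enriching the highest-level feature with ratio-invariant context reduce information loss and improve multi-scale fusion?
RQ3: Does adaptive, learnable RoI feature fusion over all pyramid levels outperform heuristic RoI level assignment and max/sum fusion?
RQ4: Do the proposed components generalize across backbones and detectors on COCO?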

Key results

Replacing FPN with AugFPN raises the AP of Faster R-CNN with ResNet50 by 2.3, to 38.8 AP.
It also raises the AP of Faster R-CNN with ResNet101 by 1.7, to 40.6 AP, and improves ResNext-101 variants by up to 1.4 AP.
With a MobileNet-V2 backbone, AugFPN gives Faster R-CNN a 1.6 AP gain.
One-stage detectors also benefit: RetinaNet gains 1.6 AP (1.3 AP is listed for ResNet50 or MobileNet-v2), and FCOS gains 0.9 AP with ResNet-50.
With AugFPN, Mask R-CNN sees detection gains of 2.0 AP (ResNet50) and 1.5 AP (ResNet101), accompanied by corresponding segmentation gains.
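
For the two Faster R-CNN results that state both a gain and a final score, the FPN baseline can be recovered by subtraction. The short sketch below does that arithmetic. The baselines it prints are derived figures, not numbers quoted in this review, and the dictionary layout and the `implied_baseline` helper are just a convenience for this example.

```python
# (detector, backbone) -> (AP gain from AugFPN, AP with AugFPN), as listed above
reported = {
    ("Faster R-CNN", "ResNet50"): (2.3, 38.8),
    ("Faster R-CNN", "ResNet101"): (1.7, 40.6),
}

def implied_baseline(gain, final_ap):
    """FPN baseline implied by a reported gain and final AP (one decimal)."""
    return round(final_ap - gain, 1)

baselines = {key: implied_baseline(*vals) for key, vals in reported.items()}

for (detector, backbone), base in baselines.items():
    gain, final_ap = reported[(detector, backbone)]
    print(f"{detector} / {backbone}: FPN {base} -> AugFPN {final_ap} (+{gain})")
```

The absolute gain shrinks slightly as the backbone gets stronger (2.3 AP on ResNet50 versus 1.7 AP on ResNet101), which is consistent with the smaller gains listed for the ResNext-101 variants.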


This review was generated by AI and reviewed by a human editor.