QUICK REVIEW

[Paper Review] Rethinking Classification and Localization for Object Detection

Yue Wu, Yinpeng Chen | arXiv (Cornell University) | 2019-04-13

Advanced Neural Network Applications | References: 47 | Citations: 40

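One-line summary

This paper analyzes how fully connected (fc-head) and convolutional (conv-head) detection heads affect classification and localization differently, and introduces the Double-Head detector, which assigns the fc-head to classification and the conv-head to bounding box regression, gaining +3.5 AP (ResNet-50) and +2.8 AP (ResNet-101) over FPN baselines on MS COCO.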

ABSTRACT

Two head structures (i.e. fully connected head and convolution head) have been widely used in R-CNN based detectors for classification and localization tasks. However, there is a lack of understanding of how does these two head structures work for these two tasks. To address this issue, we perform a thorough analysis and find an interesting fact that the two head structures have opposite preferences towards the two tasks. Specifically, the fully connected head (fc-head) is more suitable for the classification task, while the convolution head (conv-head) is more suitable for the localization task. Furthermore, we examine the output feature maps of both heads and find that fc-head has more spatial sensitivity than conv-head. Thus, fc-head has more capability to distinguish a complete object from part of an object, but is not robust to regress the whole object. Based upon these findings, we propose a Double-Head method, which has a fully connected head focusing on classification and a convolution head for bounding box regression. Without bells and whistles, our method gains +3.5 and +2.8 AP on MS COCO dataset from Feature Pyramid Network (FPN) baselines with ResNet-50 and ResNet-101 backbones, respectively.

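Motivation and Objectives

Understand how the fc-head and conv-head affect classification and localization in two-stage detectors.
Empirically compare the fc-head and conv-head on predefined proposals from the MS COCO 2017 validation set.
Identify the complementary strengths and weaknesses of the two heads.
Propose a joint architecture (Double-Head) that uses both heads to improve detection.
Explore whether extending each head with unfocused tasks raises accuracy further.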

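Proposed Method

Train the fc-head and conv-head on FPN with a ResNet-50 backbone and evaluate classification versus localization performance.
Analyze the output feature maps to measure spatial sensitivity and the correlation with IoU.
Propose the Double-Head architecture, with an fc-head for classification and a conv-head for bounding box regression.
Extend it to Double-Head-Ext, which adds unfocused-task supervision and classifier fusion at inference.
Evaluate on COCO and VOC07, with ablations over backbones and head configurations.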
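
The correlation analysis in the second step can be illustrated with a small, standard-library-only sketch. It computes the IoU between each proposal and a ground-truth box, then the Pearson correlation between classification scores and those IoUs. The boxes and scores are made-up toy values, and the helper names `box_iou` and `pearson` are hypothetical; this is not the authors' evaluation code.

```python
import math


def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


# Toy data (assumed for illustration): one ground-truth box and four
# proposals shifted progressively further away from it.
gt = (0, 0, 10, 10)
proposals = [(0, 0, 10, 10), (2, 0, 12, 10), (5, 0, 15, 10), (8, 0, 18, 10)]
scores = [0.95, 0.80, 0.50, 0.20]  # hypothetical classification scores

ious = [box_iou(p, gt) for p in proposals]
r = pearson(scores, ious)
print(f"IoUs: {[round(v, 3) for v in ious]}, Pearson r = {r:.3f}")
```

A head whose scores track proposal quality gives `r` close to 1. Running the same measurement on each head's scores is one way to compare how well the fc-head and conv-head rank proposals by IoU.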

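Experimental Results

Research Questions

RQ1: Do the fc-head and conv-head have complementary strengths for classification and localization?
RQ2: How does spatial sensitivity differ between the fc-head and conv-head, and how does this affect the correlation with IoU?
RQ3: Does splitting the tasks across two heads improve detection performance over single-head baselines?
RQ4: Does introducing unfocused tasks and classifier fusion raise accuracy further?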
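
To make the "unfocused task" idea in RQ4 concrete, here is a minimal sketch of how each head's loss could blend its focused task with its unfocused one. This review does not give the loss formula, so the convex-combination form, the parameter names (`lam_fc`, `lam_conv`, `w_fc`, `w_conv`), and the default values are all assumptions for illustration. The RPN loss is omitted.

```python
def head_loss(focused, unfocused, lam):
    """Blend a head's focused-task loss with its unfocused-task loss.

    lam in [0, 1] is the weight on the focused task (assumed form).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * focused + (1.0 - lam) * unfocused


def double_head_ext_loss(fc_cls, fc_reg, conv_cls, conv_reg,
                         lam_fc=0.7, lam_conv=0.8, w_fc=1.0, w_conv=1.0):
    """Total head loss under the assumed weighting scheme.

    fc-head:   classification is focused, box regression is unfocused.
    conv-head: box regression is focused, classification is unfocused.
    All default weights are illustrative, not confirmed by the review.
    """
    fc = head_loss(fc_cls, fc_reg, lam_fc)
    conv = head_loss(conv_reg, conv_cls, lam_conv)
    return w_fc * fc + w_conv * conv


# Setting both lambdas to 1 drops the unfocused tasks, which recovers
# the plain Double-Head objective: fc classification + conv regression.
plain = double_head_ext_loss(1.0, 2.0, 3.0, 4.0, lam_fc=1.0, lam_conv=1.0)
extended = double_head_ext_loss(1.0, 2.0, 3.0, 4.0)
print(plain, extended)
```

Under this form, Double-Head is the special case where each head is supervised only on its focused task, so RQ4 amounts to asking whether lambdas below 1 help.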

Key Results

Method	Backbone	AP	AP@0.5	AP@0.75	AP_S	AP_M	AP_L
FPN baseline	ResNet-50	36.8	58.7	40.4	21.2	40.1	48.8
Double-Head	ResNet-50	39.8	59.6	43.6	22.7	42.9	53.1
Double-Head-Ext	ResNet-50	40.3	60.3	44.2	22.4	43.3	54.3
FPN baseline	ResNet-101	39.1	61.0	42.4	22.2	42.5	51.0
Double-Head	ResNet-101	41.5	61.7	45.6	23.8	45.2	54.9
Double-Head-Ext	ResNet-101	41.9	62.4	45.9	23.9	45.2	55.8
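
As a quick arithmetic check, the AP column above can be turned into gains over the FPN baseline for each backbone. The AP values are copied from the table; the dictionary layout and the function name `ap_gains` are just one convenient, hypothetical way to organize them.

```python
# Overall AP values copied from the table above.
results = {
    "ResNet-50": {
        "FPN baseline": 36.8,
        "Double-Head": 39.8,
        "Double-Head-Ext": 40.3,
    },
    "ResNet-101": {
        "FPN baseline": 39.1,
        "Double-Head": 41.5,
        "Double-Head-Ext": 41.9,
    },
}


def ap_gains(table, baseline="FPN baseline"):
    """Return each method's AP gain over the baseline, per backbone."""
    gains = {}
    for backbone, rows in table.items():
        base = rows[baseline]
        gains[backbone] = {
            method: round(ap - base, 1)  # round away float noise
            for method, ap in rows.items()
            if method != baseline
        }
    return gains


gains = ap_gains(results)
for backbone, per_method in gains.items():
    for method, delta in per_method.items():
        print(f"{backbone:10s} {method:16s} {delta:+.1f} AP")
```

The Double-Head-Ext gains come out to +3.5 (ResNet-50) and +2.8 (ResNet-101), matching the figures quoted in the abstract. Plain Double-Head gains +3.0 and +2.4.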

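The fc-head produces classification scores that correlate more strongly with IoU, especially for small objects.
The conv-head regresses bounding boxes more accurately than the fc-head.
Double-Head (fc-head for classification, conv-head for regression) outperforms both single-head baselines on COCO with ResNet-50 and ResNet-101 backbones.
Double-Head-Ext improves results further through unfocused-task supervision and classifier fusion, and is competitive with the state of the art on COCO test-dev using a single training stage.
On VOC07, Double-Head-Ext beats the FPN baseline by a clear margin across AP, AP@0.5, and AP@0.75.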
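Double-Head-Ext with ResNet-101 is reported at 42.3 AP. This differs from the 41.9 AP listed for the same configuration in the table above, so the figure likely refers to a different evaluation split (such as COCO test-dev) rather than val2017. The accompanying claim of "over 49% AP at various thresholds relative to the baseline" is ambiguous and does not correspond to any value in the table.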
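
The classifier fusion used by Double-Head-Ext at inference can be sketched as follows. This review does not spell out the fusion rule, so the complementary form below, s = s_fc + s_conv * (1 - s_fc), is an assumption, and `fuse_scores` is a hypothetical helper name.

```python
def fuse_scores(s_fc, s_conv):
    """Fuse two per-class probabilities in [0, 1] (assumed rule).

    The fc-head score is kept as the base, and the conv-head score
    fills part of the remaining gap to 1. The expression equals
    1 - (1 - s_fc) * (1 - s_conv), so it is symmetric in its inputs.
    """
    for s in (s_fc, s_conv):
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    return s_fc + s_conv * (1.0 - s_fc)


# Hypothetical scores for one proposal and one class.
fused = fuse_scores(0.8, 0.5)
print(f"fused score = {fused:.2f}")
```

Under this rule the fused score is never lower than either input and stays within [0, 1], so a confident conv-head can raise an uncertain fc-head score but never push it down.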


This review was generated by AI and reviewed by a human editor.