QUICK REVIEW

[Paper Review] Distilling Object Detectors with Feature Richness

Zhixing Du, Rui Zhang|arXiv (Cornell University)|2021. 11. 01.

Advanced Neural Network Applications | References: 31 | Citations: 36

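One-line summary

This paper proposes the Feature Richness Score (FRS) for selecting information-rich features when distilling object detectors. By exploiting features outside the bounding boxes and removing misclassified features inside them, it reports improvements across anchor-based, anchor-free, and two-stage detectors.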

ABSTRACT

In recent years, large-scale deep models have achieved great success, but the huge computational complexity and massive storage requirements make it a great challenge to deploy them in resource-limited devices. As a model compression and acceleration method, knowledge distillation effectively improves the performance of small models by transferring the dark knowledge from the teacher detector. However, most of the existing distillation-based detection methods mainly imitate features near bounding boxes, which suffer from two limitations. First, they ignore the beneficial features outside the bounding boxes. Second, these methods imitate some features which are mistakenly regarded as the background by the teacher detector. To address the above issues, we propose a novel Feature-Richness Score (FRS) method to choose important features that improve generalized detectability during distilling. The proposed method effectively retrieves the important features outside the bounding boxes and removes the detrimental features within the bounding boxes. Extensive experiments show that our methods achieve excellent performance on both anchor-based and anchor-free detectors. For example, RetinaNet with ResNet-50 achieves 39.7% in mAP on the COCO2017 dataset, which even surpasses the ResNet-101 based teacher detector 38.9% by 0.8%. Our implementation is available at https://github.com/duzhixing/FRS.

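Research motivation and goals

Proposes knowledge distillation for object detectors that focuses on information-rich features beyond the bounding boxes.
Proposes the Feature Richness Score (FRS) to identify object-like features in all regions, not only inside the boxes.
Demonstrates a plug-and-play distillation framework that applies to anchor-based, anchor-free, and two-stage detectors.
Shows that FRS improves generalizable detection performance by using information outside the boxes and reducing misclassified features inside them.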

Proposed method

Define feature richness S as the max over categories of P(c|f, theta) using teacher classification scores.
Compute per-pyramid-level feature richness masks S_l from corresponding teacher classification scores.
Distill both FPN layers (L_FPN) and the classification head (L_head) using masks to weight pixel-wise distillation losses.
Combine losses with standard GT loss: L = L_GT + alpha L_FPN + beta L_head.
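
The four steps above can be sketched in plain Python on a toy feature map. This is a minimal illustration, not the authors' implementation: the function names (richness_mask, squared_error_map, masked_mean, total_loss) are hypothetical, the teacher scores are assumed to already be per-category probabilities, and normalising the weighted loss by the sum of the mask is an assumption that this review does not confirm.

```python
from typing import List

Grid = List[List[float]]            # [H][W], one value per location
Tensor3 = List[List[List[float]]]   # [H][W][K], K values per location


def richness_mask(teacher_scores: Tensor3) -> Grid:
    """S[i][j] = max over categories c of the teacher score P(c | f_ij)."""
    return [[max(cell) for cell in row] for row in teacher_scores]


def squared_error_map(student: Tensor3, teacher: Tensor3) -> Grid:
    """Per-location squared error, summed over channels."""
    return [
        [sum((s - t) ** 2 for s, t in zip(s_vec, t_vec))
         for s_vec, t_vec in zip(s_row, t_row)]
        for s_row, t_row in zip(student, teacher)
    ]


def masked_mean(per_location_loss: Grid, mask: Grid, eps: float = 1e-12) -> float:
    """Weight each location by the mask (assumed normalisation: sum of the mask)."""
    weighted = sum(m * v
                   for m_row, v_row in zip(mask, per_location_loss)
                   for m, v in zip(m_row, v_row))
    norm = sum(sum(row) for row in mask)
    return weighted / (norm + eps)


def total_loss(l_gt: float, l_fpn: float, l_head: float,
               alpha: float, beta: float) -> float:
    """L = L_GT + alpha * L_FPN + beta * L_head."""
    return l_gt + alpha * l_fpn + beta * l_head


# Toy 1x2 feature map with two categories and one feature channel.
teacher_scores = [[[0.9, 0.1], [0.2, 0.05]]]
mask = richness_mask(teacher_scores)  # confident location gets the larger weight
student_feat = [[[1.0], [3.0]]]
teacher_feat = [[[0.0], [0.0]]]
l_fpn = masked_mean(squared_error_map(student_feat, teacher_feat), mask)
```

In a multi-level FPN the same computation would run once per pyramid level with that level's mask S_l, and L_head would reuse masked_mean with a per-location classification loss in place of the squared error.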

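Experimental results

Research questions

RQ1: Can feature-rich regions outside the bounding boxes provide useful supervision for distilling object detectors?
RQ2: Does weighting the distillation loss with a pixel-wise feature richness mask improve student performance across detector types?
RQ3: How does FRS perform on the COCO dataset across anchor-based, anchor-free, and two-stage detectors?
RQ4: To what extent do features outside the boxes and mislabeled features inside the boxes affect distillation quality?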

Key results

Model	mAP	AP50	AP75	AP_S	AP_M	AP_L
Retina-Res101 (teacher) 2x	38.9	58.0	41.5	21.0	42.8	52.4
Retina-Res50 (student) 2x	37.4	56.7	39.6	20.0	40.7	49.7
Retina-Res50 + FRS (ours) 2x	39.7	58.6	42.4	21.8	43.5	52.4
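
The per-metric gains implied by the table can be checked with a few lines of arithmetic. The snippet below only subtracts the numbers listed in the table; the variable and function names are illustrative.

```python
metrics = ["mAP", "AP50", "AP75", "AP_S", "AP_M", "AP_L"]
teacher = [38.9, 58.0, 41.5, 21.0, 42.8, 52.4]    # Retina-Res101, 2x
student = [37.4, 56.7, 39.6, 20.0, 40.7, 49.7]    # Retina-Res50, 2x
distilled = [39.7, 58.6, 42.4, 21.8, 43.5, 52.4]  # Retina-Res50 with FRS, 2x


def deltas(after, before):
    """Difference in points for each metric, rounded to one decimal."""
    return {m: round(a - b, 1) for m, a, b in zip(metrics, after, before)}


gain_over_student = deltas(distilled, student)
gain_over_teacher = deltas(distilled, teacher)

for name in metrics:
    print(f"{name}: {gain_over_student[name]:+.1f} vs student, "
          f"{gain_over_teacher[name]:+.1f} vs teacher")
```

The distilled student gains 2.3 points of mAP over the undistilled student and 0.8 points over the teacher, and it matches the teacher on AP_L.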

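FRS yields improvements on RetinaNet, GFL, FCOS, and Faster R-CNN with a ResNet-50 student and a ResNet-101 teacher.
RetinaNet-Res50 (2x schedule) reaches 39.7% mAP, exceeding the teacher (38.9%) by 0.8 points.
GFL-Res50 (1x) improves by 3.4 points over the baseline; under the 2x schedule, some settings gain 1.8 to 4.2 points or more across metrics.
FCOS-Res50 (2x) reaches 40.9% mAP, exceeding the teacher's performance.
In the ablation, both FPN distillation and classification-head distillation contribute to the gain, and combining them yields up to 2.3 points of mAP improvement.
Qualitative and entropy analyses suggest that TP+FP regions (information-rich regions outside the boxes) are especially beneficial for distillation.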

This review was generated by AI and reviewed by a human editor.