QUICK REVIEW

[Paper Review] Consistent Optimization for Single-Shot Object Detection

Tao Kong, Fuchun Sun | arXiv (Cornell University) | 2019-01-19

Advanced Neural Network Applications | References: 42 | Citations: 19

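One-Line Summary

This paper proposes consistent optimization for single-shot object detection: refined anchors are used during training so that the training targets match what the detector sees at inference. RetinaNet is modified to optimize classification and regression on the same refined anchors used at inference, which yields a 1.0 AP gain with no architectural changes and no additional parameters (40.1 AP on COCO with a ResNet-101 backbone) and surpasses the existing single-stage detectors that use the same backbone.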

ABSTRACT

We present consistent optimization for single stage object detection. Previous works of single stage object detectors usually rely on the regular, dense sampled anchors to generate hypothesis for the optimization of the model. Through an examination of the behavior of the detector, we observe that the misalignment between the optimization target and inference configurations has hindered the performance improvement. We propose to bridge this gap by consistent optimization, which is an extension of the traditional single stage detector's optimization strategy. Consistent optimization focuses on matching the training hypotheses and the inference quality by utilizing the refined anchors during training. To evaluate its effectiveness, we conduct various design choices based on the state-of-the-art RetinaNet detector. We demonstrate it is the consistent optimization, not the architecture design, that yields the performance boosts. Consistent optimization is nearly cost-free, and achieves stable performance gains independent of the model capacities or input scales. Specifically, utilizing consistent optimization improves RetinaNet from 39.1 AP to 40.1 AP on COCO dataset without any bells or whistles, which surpasses the accuracy of all existing state-of-the-art one-stage detectors when adopting ResNet-101 as backbone. The code will be made available.

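Research Motivation and Goals

To resolve the mismatch in single-shot object detectors between the training targets (defined on the original anchors) and the inference predictions (made on the refined anchors).
To investigate whether consistent optimization, which uses refined anchors during training, improves detection accuracy.
To show that the performance gain comes from the optimization strategy itself rather than from architectural innovation.
To achieve stable, nearly cost-free accuracy gains across different model capacities and input resolutions.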

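Proposed Method

Introduces a training strategy that uses both the original anchors and their refined versions (derived from the regression output) as optimization targets.
Modifies the classification and regression heads so that they are optimized on the same refined anchor predictions used at inference, which keeps training and inference consistent.
Implements a two-stream training process so that the model learns classification and regression on the same refined anchor hypotheses it uses at inference.
Follows a design similar to Cascade R-CNN, but adapts it to single-stage detectors so that multi-stage inference is avoided.
Applies consistent optimization to RetinaNet without changing the backbone or architecture, producing ConRetinaNet.
Uses scale jitter and a longer training schedule for fair comparison.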
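
The idea in the list above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`decode`, `assign_label`, `consistent_cls_loss`) are hypothetical, the 0.5/0.4 IoU thresholds are assumed defaults, plain binary cross-entropy stands in for the focal loss that RetinaNet uses, and the two streams are simply averaged, which may differ from how the paper weights them.

```python
import math

def decode(anchor, deltas):
    """Apply (dx, dy, dw, dh) offsets to an (x1, y1, x2, y2) anchor box."""
    x1, y1, x2, y2 = anchor
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = deltas
    ncx, ncy = cx + dx * w, cy + dy * h
    nw, nh = w * math.exp(dw), h * math.exp(dh)
    return (ncx - 0.5 * nw, ncy - 0.5 * nh, ncx + 0.5 * nw, ncy + 0.5 * nh)

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def assign_label(box, gt_boxes, pos_thresh=0.5, neg_thresh=0.4):
    """Return 1 (positive), 0 (negative), or -1 (ignored). Thresholds are assumed."""
    best = max((iou(box, gt) for gt in gt_boxes), default=0.0)
    if best >= pos_thresh:
        return 1
    if best < neg_thresh:
        return 0
    return -1

def bce(p, label):
    """Binary cross-entropy for one score; a stand-in for focal loss."""
    p = min(max(p, 1e-7), 1.0 - 1e-7)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)

def consistent_cls_loss(anchors, deltas, scores, gt_boxes):
    """Average the classification loss over two streams:
    labels assigned on the original anchors and on the refined anchors."""
    total, count = 0.0, 0
    for anchor, d, p in zip(anchors, deltas, scores):
        refined = decode(anchor, d)  # the box actually scored at inference
        for box in (anchor, refined):
            label = assign_label(box, gt_boxes)
            if label >= 0:  # skip ignored boxes
                total += bce(p, label)
                count += 1
    return total / max(count, 1)

# Toy case: the original anchor overlaps the ground truth only weakly,
# but the regressed (refined) box lands exactly on it.
anchor = (0.0, 0.0, 10.0, 10.0)
gt = (5.0, 0.0, 15.0, 10.0)
refined = decode(anchor, (0.5, 0.0, 0.0, 0.0))
print(assign_label(anchor, [gt]), assign_label(refined, [gt]))
```

In the toy case the original anchor has an IoU of 1/3 with the ground truth and is labeled negative, while its refined box matches the ground truth and is labeled positive. Training on the anchor label alone would push down the score of a box that is a correct detection at inference, which is the mismatch the refined-anchor stream is meant to address.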

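Experimental Results

Research Questions

RQ1: Does the mismatch between the training targets (original anchors) and the inference predictions (refined anchors) limit the performance of single-shot detectors?
RQ2: Does improving training-inference consistency by optimizing on refined anchors lead to measurable accuracy gains?
RQ3: Do the gains come from architectural changes or from the optimization strategy itself?
RQ4: Does consistent optimization deliver stable gains across different model capacities and input resolutions?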

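Key Results

Consistent optimization improves RetinaNet with a ResNet-101 backbone from 39.1 AP to 40.1 AP on COCO, surpassing the existing single-stage detectors that use the same backbone.
Stable gains were observed across different model capacities and input resolutions, with no additional parameters and no special tricks.
On COCO test-dev, ConRetinaNet-ResNet-101 records 44.2 AP, 43.5 AP, and 53.3 AP respectively, outperforming RefineDet, DSSD, and CornerNet.
An analysis of the design choices shows that the gain comes from optimization consistency rather than from the architecture design.
Even with a ResNet-50 backbone, ConRetinaNet records 40.2 AP, outperforming both DSSD-ResNet-101 and RefineDet-ResNet-101.
The method is nearly cost-free in computation and parameters, so it can be applied broadly to existing single-stage detectors.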

This review was generated by AI and reviewed by a human editor.