QUICK REVIEW

[Paper Review] Gradient Harmonized Single-stage Detector

Buyu Li, Yu Liu | arXiv (Cornell University) | 2018-11-13

Anomaly Detection Techniques and Applications | References: 35 | Citations: 67

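One-Line Summary

This paper introduces the Gradient Harmonizing Mechanism (GHM) to balance the gradient contributions of training examples in single-stage detectors. It proposes GHM-C for classification and GHM-R for regression, and achieves state-of-the-art performance on COCO without extensive hyper-parameter tuning.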

ABSTRACT

Despite the great success of two-stage detectors, single-stage detector is still a more elegant and efficient way, yet suffers from the two well-known disharmonies during training, i.e. the huge difference in quantity between positive and negative examples as well as between easy and hard examples. In this work, we first point out that the essential effect of the two disharmonies can be summarized in term of the gradient. Further, we propose a novel gradient harmonizing mechanism (GHM) to be a hedging for the disharmonies. The philosophy behind GHM can be easily embedded into both classification loss function like cross-entropy (CE) and regression loss function like smooth-$L_1$ ($SL_1$) loss. To this end, two novel loss functions called GHM-C and GHM-R are designed to balancing the gradient flow for anchor classification and bounding box refinement, respectively. Ablation study on MS COCO demonstrates that without laborious hyper-parameter tuning, both GHM-C and GHM-R can bring substantial improvement for single-stage detector. Without any whistles and bells, our model achieves 41.6 mAP on COCO test-dev set which surpasses the state-of-the-art method, Focal Loss (FL) + $SL_1$, by 0.8.

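Research Motivation and Goals

Identify the sources of training imbalance in single-stage detectors: class imbalance (positive vs. negative examples) and attribute imbalance (easy vs. hard examples).
Propose a gradient-based harmonizing mechanism that balances the gradient contributions of examples during training.
Develop GHM-C for classification and GHM-R for regression, both of which adapt to each batch of data without extensive hyper-parameter tuning.
Demonstrate improvements on COCO with a RetinaNet-style single-stage detector, and compare against Focal Loss and other baselines.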

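Proposed Method

Define the gradient density GD(g) from the distribution of the gradient norm g over all training examples, i.e. the number of examples per unit of gradient norm around g.
Compute a per-example gradient harmonizing parameter beta_i = N / GD(g_i) and use it to reweight the loss.
Formulate GHM-C by replacing the plain CE loss with L_GHM-C = (1/N) sum_i beta_i L_CE(p_i, p_i*).
Introduce ASL1 (Authentic Smooth L1), whose gradient norm is denoted gr, and extend GHM to regression with L_GHM-R = (1/N) sum_i beta_i ASL1(d_i).
Approximate the gradient density with unit regions of width epsilon and smooth it with an exponential moving average (EMA) to keep mini-batch updates stable.
Show that GHM adapts to the data distribution of each batch and reduces the dominance of easy negatives and outliers.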
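The steps listed above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (ghm_weights, ghm_c_loss, ghm_r_loss, asl1) are hypothetical, a sigmoid classifier is assumed so that the gradient norm is g = |p - p*|, ASL1 is assumed to take the form sqrt(d^2 + mu^2) - mu, and the defaults bins=30, momentum=0.75, and mu=0.02 are assumed values.

```python
import numpy as np


def ghm_weights(g, bins=30, momentum=0.0, ema_counts=None):
    """Return (beta, counts) for gradient norms g in [0, 1].

    Unit-region approximation: [0, 1] is split into `bins` regions of
    width epsilon = 1 / bins, and GD(g) ~= count_in_region / epsilon.
    """
    g = np.asarray(g, dtype=np.float64)
    n = g.size
    # Index of the unit region each gradient norm falls into.
    idx = np.minimum((g * bins).astype(np.int64), bins - 1)
    counts = np.bincount(idx, minlength=bins).astype(np.float64)
    if ema_counts is not None and momentum > 0:
        # EMA smoothing of the per-region counts across mini-batches.
        counts = momentum * ema_counts + (1.0 - momentum) * counts
    gd = counts[idx] * bins          # approximate gradient density GD(g_i)
    beta = n / gd                    # beta_i = N / GD(g_i)
    return beta, counts


def ghm_c_loss(logits, targets, **kw):
    """GHM-C sketch: (1/N) * sum_i beta_i * CE(p_i, p_i*), sigmoid assumed."""
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    p = np.exp(-np.logaddexp(0.0, -logits))            # stable sigmoid
    beta, counts = ghm_weights(np.abs(p - targets), **kw)
    ce = np.logaddexp(0.0, logits) - targets * logits  # stable binary CE
    return float(np.mean(beta * ce)), counts


def asl1(d, mu=0.02):
    """Assumed ASL1 form: sqrt(d^2 + mu^2) - mu (mu is an assumed default)."""
    d = np.asarray(d, dtype=np.float64)
    return np.sqrt(d * d + mu * mu) - mu


def ghm_r_loss(d, mu=0.02, **kw):
    """GHM-R sketch: (1/N) * sum_i beta_i * ASL1(d_i)."""
    d = np.asarray(d, dtype=np.float64)
    gr = np.abs(d) / np.sqrt(d * d + mu * mu)          # gradient norm in [0, 1)
    beta, counts = ghm_weights(gr, **kw)
    return float(np.mean(beta * asl1(d, mu))), counts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(-3.0, 1.0, size=1000)          # mostly easy negatives
    targets = (rng.random(1000) < 0.02).astype(float)
    ema = None
    for _ in range(3):                                 # carry EMA counts forward
        loss, ema = ghm_c_loss(logits, targets, momentum=0.75, ema_counts=ema)
    print("GHM-C loss:", loss)
```

In this sketch a crowded region of gradient norms (for example, the many easy negatives near g = 0) receives a small beta, while sparsely populated regions receive a larger one. If the gradient norms were spread uniformly over the regions, every beta would equal 1 and the loss would reduce to the plain average.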

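Experimental Results

Research Questions

RQ1: Does reweighting based on gradient density improve the training efficiency and accuracy of single-stage detectors?
RQ2: How do GHM-C and GHM-R compare with cross-entropy and smooth L1 loss, respectively, on the COCO benchmark?
RQ3: Does the proposed EMA-based gradient density estimation provide stable and scalable training on large datasets?
RQ4: Does the GHM approach maintain or improve accuracy when transferred to two-stage detectors and other backbones?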

Key Results

Method	Network	AP	AP50	AP75	AP_S	AP_M	AP_L
Faster RCNN	FPN-ResNet-101	36.2	59.1	39.0	18.2	39.0	48.2
Mask RCNN	FPN-ResNet-101	38.2	60.3	41.7	20.1	41.1	50.2
Mask RCNN	FPN-ResNeXt-101	39.8	62.3	43.4	22.1	43.2	51.2
YOLOv3	DarkNet-53	33.0	57.9	34.4	18.3	35.4	41.9
DSSD513	DSSD-ResNet-101	33.2	53.3	35.2	13.0	35.4	51.1
Focal Loss	RetinaNet-FPN-ResNet-101	39.1	59.1	42.3	21.8	42.7	50.2
Focal Loss	RetinaNet-FPN-ResNeXt-101	40.8	61.1	44.1	24.1	44.2	51.2
GHM-C + GHM-R	RetinaNet-FPN-ResNet-101	39.9	60.8	42.5	20.3	43.6	54.1
GHM-C + GHM-R	RetinaNet-FPN-ResNeXt-101	41.6	62.8	44.2	22.3	45.1	55.3

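GHM-C substantially improves classification over standard CE and is competitive with or better than Focal Loss on COCO.
GHM-R improves bounding box regression over SL1 and ASL1, with better localization at high IoU thresholds in particular.
Combined with RetinaNet, GHM-C and GHM-R achieve results at the state-of-the-art level on COCO test-dev and outperform the Focal Loss variants.
With the unit-region approximation (M of about 30), training remains efficient: it is considerably faster than the naive gradient density computation while retaining the performance gains.
The GHM approach extends to two-stage detectors, improving AP over the SL1 baseline in Faster R-CNN variants.
On COCO test-dev, GHM-C + GHM-R reaches 39.9 AP with RetinaNet-ResNet-101 and 41.6 AP with ResNeXt-101, surpassing the Focal Loss baselines.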
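To make the comparison with Focal Loss concrete, the per-metric margins can be computed directly from the table rows above. The short sketch below only restates those table values; the dictionary and function names are hypothetical.

```python
# AP values copied from the results table (COCO test-dev, RetinaNet-FPN).
METRICS = ["AP", "AP50", "AP75", "AP_S", "AP_M", "AP_L"]
FOCAL = {
    "ResNet-101": [39.1, 59.1, 42.3, 21.8, 42.7, 50.2],
    "ResNeXt-101": [40.8, 61.1, 44.1, 24.1, 44.2, 51.2],
}
GHM = {
    "ResNet-101": [39.9, 60.8, 42.5, 20.3, 43.6, 54.1],
    "ResNeXt-101": [41.6, 62.8, 44.2, 22.3, 45.1, 55.3],
}


def deltas(backbone):
    """Margin of GHM-C + GHM-R over Focal Loss for each metric."""
    return {
        m: round(g - f, 1)
        for m, g, f in zip(METRICS, GHM[backbone], FOCAL[backbone])
    }


if __name__ == "__main__":
    for backbone in FOCAL:
        print(backbone, deltas(backbone))
```

By these table values, the overall AP margin is 0.8 on both backbones. The gain is concentrated in AP_L (3.9 and 4.1 points), while AP_S is lower than Focal Loss (by 1.5 and 1.8 points).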


This review was generated by AI and reviewed by a human editor.