QUICK REVIEW

[Paper Review] Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection

Bowen Cheng, Yunchao Wei|arXiv (Cornell University)|Oct 5, 2018

Advanced Neural Network Applications | References: 53 | Citations: 43

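One-Sentence Summary

This paper proposes Decoupled Classification Refinement (DCR), a module that decouples classification from localization to suppress hard false positives and improves mAP on the PASCAL VOC and COCO benchmarks.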

ABSTRACT

In this paper, we analyze failure cases of state-of-the-art detectors and observe that most hard false positives result from classification instead of localization and they have a large negative impact on the performance of object detectors. We conjecture there are three factors: (1) Shared feature representation is not optimal due to the mismatched goals of feature learning for classification and localization; (2) multi-task learning helps, yet optimization of the multi-task loss may result in sub-optimal for individual tasks; (3) large receptive field for different scales leads to redundant context information for small objects. We demonstrate the potential of detector classification power by a simple, effective, and widely-applicable Decoupled Classification Refinement (DCR) network. In particular, DCR places a separate classification network in parallel with the localization network (base detector). With ROI Pooling placed on the early stage of the classification network, we enforce an adaptive receptive field in DCR. During training, DCR samples hard false positives from the base detector and trains a strong classifier to refine classification results. During testing, DCR refines all boxes from the base detector. Experiments show competitive results on PASCAL VOC and COCO without any bells and whistles. Our codes are available at: https://github.com/bowenc0221/Decoupled-Classification-Refinement.
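The abstract's remark that early ROI Pooling enforces an adaptive receptive field can be illustrated with a toy example: whatever the size of the region, pooling reduces it to the same fixed grid, so later layers see only the object's own extent. The sketch below is a minimal pure-Python illustration, not the authors' implementation; the name roi_max_pool, the half-open box convention, and the floor/ceil bin rule are assumptions made for the example.

```python
import math
from typing import List, Tuple


def roi_max_pool(feature: List[List[float]],
                 roi: Tuple[int, int, int, int],
                 out_size: int = 2) -> List[List[float]]:
    """Max-pool a region of a 2D grid down to out_size x out_size.

    roi is (x1, y1, x2, y2) with inclusive start and exclusive end
    (an assumed convention for this sketch).
    """
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    if h < 1 or w < 1:
        raise ValueError("ROI must cover at least one cell")
    pooled = []
    for by in range(out_size):
        # Floor the bin start and ceil the bin end so bins cover the whole ROI.
        ys = y1 + math.floor(by * h / out_size)
        ye = y1 + math.ceil((by + 1) * h / out_size)
        row = []
        for bx in range(out_size):
            xs = x1 + math.floor(bx * w / out_size)
            xe = x1 + math.ceil((bx + 1) * w / out_size)
            row.append(max(feature[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# A 4x4 toy feature map with values 0..15 in row-major order.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
large = roi_max_pool(grid, (0, 0, 4, 4))  # whole map
small = roi_max_pool(grid, (1, 1, 3, 3))  # central 2x2 region
```

Both calls return a 2x2 output, although the first region covers sixteen cells and the second only four. In a real detector the same operation would run on convolutional feature maps, typically through a library routine rather than hand-written loops.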

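Motivation and Objectives

Analyze the failure modes of state-of-the-art region-based detectors and identify where hard false positives stem from classification rather than localization.
Propose a decoupled architecture that refines classification without changing the base detector's localization.
Demonstrate that decoupled classification yields consistent gains across Faster RCNN variants and common benchmarks.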

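Proposed Method

Proposes Decoupled Classification Refinement (DCR), which places a separate classification network in parallel with the base detector.
Applies ROI Pooling at an early stage of the DCR network to obtain an adaptive receptive field that focuses on context at the scale of the object.
Trains DCR by sampling high-confidence false positives from the base detector and learning a strong classifier that corrects them.
Develops two variants: DCR V1 (naive, decoupled, separately trained) and DCR V2 (faster, end-to-end with a shared backbone, with a top-sampling strategy).
DCR V2 shares part of the backbone features to balance speed and accuracy, and places ROI Pooling at an earlier network stage to achieve the adaptive receptive field.
Training optimizes a combined loss: L = L_RPN + L_RCNN + L_DCRV2 (for DCR V2).
Provides top-sampling as an inference strategy, which reduces DCR V2 runtime by processing only the top-scoring detections.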
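The training step described above, sampling confident false positives from the base detector, can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the tuple layout for detections and ground truths, and both thresholds (score 0.3, IoU 0.5) are assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[Box, float, str]        # (box, score, predicted class)
GroundTruth = Tuple[Box, str]             # (box, true class)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sample_hard_false_positives(detections: List[Detection],
                                ground_truths: List[GroundTruth],
                                score_thresh: float = 0.3,  # assumed value
                                iou_thresh: float = 0.5     # assumed value
                                ) -> List[Detection]:
    """Return confident detections that match no same-class ground-truth box."""
    hard = []
    for box, score, label in detections:
        if score < score_thresh:
            continue  # low-confidence boxes are not "hard" mistakes
        matched = any(gt_label == label and iou(box, gt_box) >= iou_thresh
                      for gt_box, gt_label in ground_truths)
        if not matched:
            hard.append((box, score, label))
    return hard


gts = [((0, 0, 10, 10), "cat")]
dets = [
    ((0, 0, 10, 10), 0.9, "cat"),        # correct detection
    ((0, 0, 10, 10), 0.8, "dog"),        # right place, wrong class
    ((50, 50, 60, 60), 0.1, "cat"),      # wrong, but low confidence
    ((100, 100, 110, 110), 0.7, "cat"),  # confident, no object there
]
hard = sample_hard_false_positives(dets, gts)
```

In this toy input, the confident "dog" on top of the cat and the confident "cat" on empty background are kept as hard false positives, while the correct detection and the low-score box are skipped. A refinement classifier would then be trained on crops of boxes like these.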

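Experimental Results

Research Questions

RQ1: Can decoupling classification from localization suppress hard false positives in region-based detectors?
RQ2: Do an adaptive receptive field and partial feature sharing improve detection performance and speed?
RQ3: How do DCR V1 and DCR V2 compare in accuracy and efficiency across standard benchmarks?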

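Key Findings

DCR substantially reduces hard false positives and improves mAP over a strong baseline on PASCAL VOC 2007 (a gain of about 2.7% is reported).
With a ResNet-101 backbone, DCR achieves 84.2% mAP on PASCAL VOC 2007 and 81.2% on PASCAL VOC 2012.
On COCO test-dev, DCR achieves 43.5% mAP.
Decoupling classification from localization and using an adaptive receptive field outperforms a fully shared-feature architecture; ablations show mAP gains of up to 4.6%, depending on the configuration.
DCR V2 with top-sampling runs at a speed close to the Faster RCNN baseline while improving accuracy.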
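To make the top-sampling idea concrete, here is a rough sketch of rescoring only the k highest-scoring boxes at inference time. The helper name refine_top_k, the callback refine_fn standing in for the DCR classifier, the choice to leave the remaining boxes unchanged, and the rule of multiplying the base score by the refined score are all assumptions for illustration; this review does not specify how the two scores are combined.

```python
from typing import Callable, List, Sequence


def refine_top_k(base_scores: Sequence[float],
                 refine_fn: Callable[[int], float],
                 k: int) -> List[float]:
    """Rescore only the k highest-scoring boxes; leave the rest unchanged.

    refine_fn(i) is a hypothetical stand-in for running the refinement
    classifier on box i and returning its score for the predicted class.
    """
    order = sorted(range(len(base_scores)),
                   key=lambda i: base_scores[i], reverse=True)
    top = set(order[:max(k, 0)])
    refined = []
    for i, score in enumerate(base_scores):
        if i in top:
            # Assumed combination rule: product of base and refined scores.
            refined.append(score * refine_fn(i))
        else:
            refined.append(score)
    return refined


calls = []
dcr_scores = [0.1, 0.9, 1.0, 0.5]  # made-up classifier outputs


def fake_dcr(i: int) -> float:
    calls.append(i)  # record which boxes the classifier actually ran on
    return dcr_scores[i]


out = refine_top_k([0.9, 0.2, 0.8, 0.5], fake_dcr, k=2)
```

Only boxes 0 and 2 reach the classifier here, so the cost of refinement scales with k rather than with the total number of detections. Box 0, a confident detection that the classifier rejects, drops sharply in score.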

This review was generated by AI and checked by a human editor.