QUICK REVIEW

[Paper Review] Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection

Hongkai Zhang, Hong Chang|arXiv (Cornell University)|Jul 16, 2019

Advanced Neural Network Applications | References: 35 | Citations: 50

One-Line Summary

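Cas-RetinaNet introduces a cascaded single-stage detector whose stages use progressively higher IoU thresholds, together with a Feature Consistency Module that aligns features between stages, and achieves consistent AP gains over RetinaNet on COCO (e.g., from 39.1 to 41.1 AP).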

ABSTRACT

Recent researches attempt to improve the detection performance by adopting the idea of cascade for single-stage detectors. In this paper, we analyze and discover that inconsistency is the major factor limiting the performance. The refined anchors are associated with the feature extracted from the previous location and the classifier is confused by misaligned classification and localization. Further, we point out two main designing rules for the cascade manner: improving consistency between classification confidence and localization performance, and maintaining feature consistency between different stages. A multistage object detector named Cas-RetinaNet, is then proposed for reducing the misalignments. It consists of sequential stages trained with increasing IoU thresholds for improving the correlation, and a novel Feature Consistency Module for mitigating the feature inconsistency. Experiments show that our proposed Cas-RetinaNet achieves stable performance gains across different models and input scales. Specifically, our method improves RetinaNet from 39.1 AP to 41.1 AP on the challenging MS COCO dataset without any bells or whistles.

Motivation and Objectives

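Identifies the main limitations of cascade methods in single-stage detectors, namely the misalignment between classification and localization and the feature inconsistency between stages.
Proposes two design rules: (a) align classification confidence with localization quality, and (b) maintain feature consistency between stages.
Develops Cas-RetinaNet, a detector with sequential stages, and a novel Feature Consistency Module (FCM).
Demonstrates performance gains on MS COCO across backbones and input scales, and analyzes the trade-off between the number of stages and inference cost.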

Proposed Method

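Proposes a cascaded single-stage detector that adds stages with successively higher IoU thresholds, so that classification targets align with localization quality.
Introduces a Feature Consistency Module (FCM) that learns location offsets and uses deformable convolution to adapt the features to the refined anchor locations.
Trains all stages with the total loss L = Σ_i L^i, the sum of the per-stage losses L^i; positives at each stage are determined by that stage's IoU threshold.
At inference, averages the classification scores of the cascade stages to improve robustness.
Keeps a lightweight head structure matching RetinaNet, so that the gains can be attributed to the cascade and the FCM.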
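
The stage-specific positive assignment, the summed loss, and the score averaging described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the example boxes, and the thresholds 0.5 and 0.6 are assumptions chosen for illustration, and the box refinement between stages and the FCM itself are not modeled.

```python
from typing import List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_positives(anchors: List[Box], gt_boxes: List[Box],
                     fg_iou_thresh: float) -> List[bool]:
    """An anchor is positive when its best IoU with any ground-truth box
    reaches the threshold of the current stage."""
    return [
        max((iou(a, g) for g in gt_boxes), default=0.0) >= fg_iou_thresh
        for a in anchors
    ]


def total_loss(stage_losses: List[float]) -> float:
    """L = sum over stages i of L^i."""
    return sum(stage_losses)


def average_scores(stage_scores: List[List[float]]) -> List[float]:
    """Average per-anchor classification scores over the cascade stages."""
    n_stages = len(stage_scores)
    return [sum(per_anchor) / n_stages for per_anchor in zip(*stage_scores)]


# Hypothetical data: one ground-truth box and four anchors shifted to the right.
gt = [(0, 0, 10, 10)]
anchors = [(0, 0, 10, 10), (2, 0, 12, 10), (3, 0, 13, 10), (4, 0, 14, 10)]

# Assumed thresholds for a two-stage cascade (illustrative only).
stage1_pos = assign_positives(anchors, gt, fg_iou_thresh=0.5)
stage2_pos = assign_positives(anchors, gt, fg_iou_thresh=0.6)

# Hypothetical per-stage scores for two anchors, averaged at inference.
final_scores = average_scores([[0.5, 0.25], [1.0, 0.75]])
```

In this toy example the third anchor (IoU of about 0.54) counts as positive at the first threshold but not at the second, which is the sense in which later stages train the classifier on better-localized samples.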

Experimental Results

Research Questions

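RQ1: Can a cascade-style single-stage detector improve detection performance without using region proposals?
RQ2: How can classification confidence be aligned with actual localization quality in a cascade setting?
RQ3: Does adapting features between cascade stages via the Feature Consistency Module reduce misalignment and improve accuracy?
RQ4: What number of cascade stages best balances accuracy and speed?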

Key Findings

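Cas-RetinaNet yields consistent AP gains over RetinaNet on COCO (e.g., on test-dev with ResNet-101 and 800 input, from 39.1 AP for RetinaNet to 41.1 AP for Cas-RetinaNet).
Raising the foreground IoU threshold in later stages improves high-IoU performance (e.g., AP90) with only a small effect on low-IoU AP. A two-stage cascade offers a good trade-off.
The Feature Consistency Module consistently improves AP by about 1 point across backbones and input scales (e.g., ResNet-50 at 600 input: 34.4→35.5 with FCM; at 800 input: 36.1→37.1).
Cas-RetinaNet with a two-stage cascade achieves the best overall trade-off in the experiments (Table 3).
Compared with state-of-the-art detectors, Cas-RetinaNet with ResNet-101 achieves competitive or better results without bells and whistles (41.1 AP on COCO test-dev at 800 input).
Inference speed incurs a modest overhead (e.g., at 800 input, Cas-RetinaNet with one additional stage runs at about 10 FPS versus about 12.5 FPS for RetinaNet).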
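
To put the speed figures above in per-image terms, the short calculation below converts the quoted frame rates into latency. It uses only the approximate FPS values from this review, so the results are approximate as well; the helper name `latency_ms` is an illustrative assumption.

```python
def latency_ms(fps: float) -> float:
    """Per-image latency in milliseconds for a given throughput in frames per second."""
    return 1000.0 / fps


retinanet_ms = latency_ms(12.5)      # RetinaNet at 800 input, about 12.5 FPS
cas_retinanet_ms = latency_ms(10.0)  # Cas-RetinaNet with one extra stage, about 10 FPS

extra_ms = cas_retinanet_ms - retinanet_ms   # added latency per image
relative_increase = extra_ms / retinanet_ms  # added latency as a fraction of the baseline

print(f"RetinaNet: {retinanet_ms:.0f} ms, Cas-RetinaNet: {cas_retinanet_ms:.0f} ms")
print(f"Overhead: {extra_ms:.0f} ms per image ({relative_increase:.0%})")
```

Under these figures the extra stage costs roughly 20 ms per image, about a quarter more latency than the baseline, in exchange for the 2.0 AP gain reported on test-dev.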

This review was generated by AI and checked by a human editor.