QUICK REVIEW

[Paper Review] Prime Sample Attention in Object Detection

Yuhang Cao, Kai Chen|arXiv (Cornell University)|Apr 9, 2019

Advanced Neural Network Applications | References: 35 | Citations: 44

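One-line summary

This paper proposes PrIme Sample Attention (PISA), a sampling and learning strategy that uses Hierarchical Local Rank (HLR) and a classification-aware regression loss to concentrate training on prime samples (the positive and negative proposals with the greatest influence). It improves mAP and outperforms random sampling and hard mining on the COCO and VOC benchmarks.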

ABSTRACT

It is a common paradigm in object detection frameworks to treat all samples equally and target at maximizing the performance on average. In this work, we revisit this paradigm through a careful study on how different samples contribute to the overall performance measured in terms of mAP. Our study suggests that the samples in each mini-batch are neither independent nor equally important, and therefore a better classifier on average does not necessarily mean higher mAP. Motivated by this study, we propose the notion of Prime Samples, those that play a key role in driving the detection performance. We further develop a simple yet effective sampling and learning strategy called PrIme Sample Attention (PISA) that directs the focus of the training process towards such samples. Our experiments demonstrate that it is often more effective to focus on prime samples than hard samples when training a detector. Particularly, on the MSCOCO dataset, PISA outperforms the random sampling baseline and hard mining schemes, e.g., OHEM and Focal Loss, consistently by around 2% on both single-stage and two-stage detectors, even with a strong backbone ResNeXt-101.

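Motivation and objectives

Question the assumption that every sample in a mini-batch contributes equally to mAP.
Identify the samples that most affect detection performance and determine how to rank them.
Propose a practical sampling and loss strategy that emphasizes prime samples during training.
Demonstrate improvements on COCO and VOC for both two-stage and single-stage detectors.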

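Proposed method

Define prime samples as the samples with the greatest influence on detection performance.
Introduce Hierarchical Local Rank (HLR), which ranks positive samples by IoU and negative samples by classification score within each mini-batch.
Develop Importance-based Sample Reweighting (ISR), which converts HLR ranks into loss weights for both positive and negative samples.
Propose a Classification-Aware Regression Loss (CARL) that jointly optimizes classification and regression through sample-dependent weighting.
Apply PISA to both two-stage and single-stage detectors without adding any inference overhead.
Show that PISA yields gains over random sampling and hard mining on COCO and VOC.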
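
The ranking and reweighting steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `hierarchical_local_rank` and `rank_to_weight`, the `group_ids` input (for positives, the index of the assigned ground-truth box; for negatives, whatever local grouping the caller chooses), the linear rank-to-importance mapping, and the values of `beta` and `gamma` are all hypothetical choices made for the example.

```python
from collections import defaultdict


def hierarchical_local_rank(values, group_ids):
    """Return one rank per sample, where rank 0 is the most important.

    values:    IoU for positives, or classification score for negatives.
    group_ids: local group of each sample (assumed input, e.g. the index
               of the ground-truth box a positive sample is assigned to).
    """
    groups = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        groups[gid].append(idx)

    # Step 1: rank samples locally inside each group, highest value first.
    levels = defaultdict(list)
    for members in groups.values():
        members.sort(key=lambda i: values[i], reverse=True)
        for local_rank, idx in enumerate(members):
            levels[local_rank].append(idx)

    # Step 2: sort the samples that share a local rank by value, then
    # concatenate the levels so every top-1 sample precedes every top-2 one.
    order = []
    for local_rank in sorted(levels):
        order.extend(
            sorted(levels[local_rank], key=lambda i: values[i], reverse=True)
        )

    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def rank_to_weight(ranks, beta=0.2, gamma=2.0):
    """Map ranks to loss weights in (0, 1]; beta and gamma are illustrative."""
    n = len(ranks)
    return [((1 - beta) * (n - r) / n + beta) ** gamma for r in ranks]


# Five positive samples assigned to two ground-truth boxes (toy values).
ious = [0.9, 0.6, 0.8, 0.7, 0.55]
gt_index = [0, 0, 1, 1, 1]
ranks = hierarchical_local_rank(ious, gt_index)
print(ranks)  # [0, 3, 1, 2, 4]
print(rank_to_weight(ranks))
```

In this toy batch the sample with IoU 0.7 outranks the one with IoU 0.6 only after both top-1 samples have been placed, which is what makes the ranking hierarchical rather than a plain global sort.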

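Experimental results

Research questions

RQ1: Which samples matter most when training an object detector, and how can their importance be quantified?
RQ2: Does prioritizing prime samples during training improve mAP over conventional random sampling and hard mining?
RQ3: How can classification and localization be optimized jointly to strengthen the attention paid to prime samples?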

Key findings

Method	Backbone	AP	AP50	AP75	AP_S	AP_M	AP_L
Faster R-CNN	ResNet-50	36.7	58.8	39.6	21.6	39.8	44.9
Faster R-CNN	ResNeXt-101	40.3	62.7	44.0	24.4	43.7	49.8
Mask R-CNN	ResNet-50	37.5	59.4	40.7	22.1	40.6	46.2
Mask R-CNN	ResNeXt-101	41.4	63.4	45.2	24.5	44.9	51.8
Faster R-CNN w/ PISA	ResNet-50	38.8	59.3	42.7	22.1	41.7	48.8
Faster R-CNN w/ PISA	ResNeXt-101	42.3	62.9	46.8	24.8	45.5	53.1
Mask R-CNN w/ PISA	ResNet-50	39.3	59.6	43.5	22.1	42.3	49.4
Mask R-CNN w/ PISA	ResNeXt-101	42.9	63.2	47.4	24.9	46.2	54.0
RetinaNet	ResNet-50	37.3	56.5	40.3	20.3	40.4	47.2
RetinaNet w/ PISA	ResNet-50	37.3	56.5	40.3	20.3	40.4	47.2
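
As a quick arithmetic check on the two-stage rows of the table above, the per-model gains can be computed directly. The AP and AP75 values below are copied from the Faster R-CNN and Mask R-CNN rows; the dictionary layout and variable names are only illustrative.

```python
# (AP, AP75) pairs copied from the Faster R-CNN and Mask R-CNN table rows.
rows = {
    ("Faster R-CNN", "ResNet-50"): {"base": (36.7, 39.6), "pisa": (38.8, 42.7)},
    ("Faster R-CNN", "ResNeXt-101"): {"base": (40.3, 44.0), "pisa": (42.3, 46.8)},
    ("Mask R-CNN", "ResNet-50"): {"base": (37.5, 40.7), "pisa": (39.3, 43.5)},
    ("Mask R-CNN", "ResNeXt-101"): {"base": (41.4, 45.2), "pisa": (42.9, 47.4)},
}

gains = {}
for key, row in rows.items():
    # Gain = PISA value minus baseline value, rounded to one decimal place.
    gains[key] = tuple(
        round(pisa - base, 1) for base, pisa in zip(row["base"], row["pisa"])
    )
    print(key, "AP gain:", gains[key][0], "AP75 gain:", gains[key][1])

mean_ap_gain = sum(g[0] for g in gains.values()) / len(gains)
mean_ap75_gain = sum(g[1] for g in gains.values()) / len(gains)
print("mean AP gain:", round(mean_ap_gain, 2))
print("mean AP75 gain:", round(mean_ap75_gain, 2))
```

For these four rows the AP gains range from 1.5 to 2.1 points and the AP75 gains from 2.2 to 3.1 points, so the improvement is larger at the stricter IoU threshold.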

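PISA consistently improves mAP on COCO for Faster R-CNN, Mask R-CNN, RetinaNet, and SSD detectors, with backbones such as ResNet-50 and ResNeXt-101-32x4d.
On COCO test-dev, PISA delivers an absolute mAP gain of about 2% over the baseline for both single-stage and two-stage detectors.
PISA outperforms random sampling and hard mining for both positive and negative samples, with especially large gains at high IoU thresholds (e.g., AP75).
HLR-based ranking places high-IoU positive samples and high-score negative samples at the top of their respective rank lists, steering training toward prime samples.
CARL correlates classification and regression by letting the regression loss modulate the classification scores, which reinforces prime samples.
PISA also improves results on VOC07, indicating that it generalizes across datasets.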
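
The coupling between classification and regression described for CARL can be illustrated with a small sketch. It assumes one plausible form of sample-dependent weighting: each positive sample's regression loss is scaled by a function of its classification probability, normalized so the weights average to 1. The function names and the parameters `b` and `k` are hypothetical, and the numbers are toy values rather than results from the paper.

```python
def carl_weights(cls_probs, b=0.2, k=1.0):
    """Per-sample weights that grow with classification probability.

    b sets a floor so low-score samples keep some weight; k controls how
    sharply the weight rises. Both values are illustrative assumptions.
    """
    v = [((1 - b) * p + b) ** k for p in cls_probs]
    mean_v = sum(v) / len(v)
    # Normalize so the weights average to 1 and the loss scale is preserved.
    return [x / mean_v for x in v]


def carl_loss(cls_probs, reg_losses, b=0.2, k=1.0):
    """Sum of per-sample regression losses scaled by the weights above."""
    weights = carl_weights(cls_probs, b, k)
    return sum(w * loss for w, loss in zip(weights, reg_losses))


# Three positive samples: predicted class probability and regression loss.
probs = [0.9, 0.5, 0.1]
reg = [0.2, 0.4, 0.8]
print(carl_weights(probs))
print(carl_loss(probs, reg))
```

In an autograd framework the weights would need to stay differentiable with respect to the classification probabilities; that is what would let the regression loss send gradients back into the classification branch, which plain Python floats cannot show.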

This review was generated by AI and checked by a human editor.