QUICK REVIEW

[Paper Review] Prime Sample Attention in Object Detection

Yuhang Cao, Kai Chen|arXiv (Cornell University)|Apr 9, 2019

Advanced Neural Network Applications | References: 35 | Cited by: 44

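One-Sentence Summary

The paper introduces Prime Sample Attention (PISA), a training strategy that uses Hierarchical Local Rank (HLR) and a classification-aware regression loss to focus on prime samples (high-impact positive and negative proposals), improving mAP over random sampling and hard mining on the COCO and VOC benchmarks.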

ABSTRACT

It is a common paradigm in object detection frameworks to treat all samples equally and target at maximizing the performance on average. In this work, we revisit this paradigm through a careful study on how different samples contribute to the overall performance measured in terms of mAP. Our study suggests that the samples in each mini-batch are neither independent nor equally important, and therefore a better classifier on average does not necessarily mean higher mAP. Motivated by this study, we propose the notion of Prime Samples, those that play a key role in driving the detection performance. We further develop a simple yet effective sampling and learning strategy called PrIme Sample Attention (PISA) that directs the focus of the training process towards such samples. Our experiments demonstrate that it is often more effective to focus on prime samples than hard samples when training a detector. Particularly, on the MSCOCO dataset, PISA outperforms the random sampling baseline and hard mining schemes, e.g., OHEM and Focal Loss, consistently by around 2% on both single-stage and two-stage detectors, even with a strong backbone ResNeXt-101.

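Motivation and Objectives

Challenge the assumption that all samples in a mini-batch contribute equally to mAP.
Identify which samples affect detection performance the most and how to rank them.
Propose a practical sampling and loss strategy that emphasizes prime samples during training.
Demonstrate improvements for both two-stage and single-stage detectors on COCO and VOC.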

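Proposed Method

Define prime samples as the samples that have the greatest impact on detection performance.
Introduce Hierarchical Local Rank (HLR), which ranks positive samples by IoU and negative samples by score within each mini-batch.
Develop Importance-based Sample Reweighting (ISR), which converts HLR ranks into loss weights for positive and negative samples.
Propose a Classification-Aware Regression Loss (CARL), which jointly optimizes classification and regression using sample-aware weights.
Apply PISA to both two-stage and single-stage detectors without adding inference overhead.
Show that PISA yields larger gains than random sampling and hard mining on COCO and VOC.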
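
As a rough illustration of how HLR and ISR could fit together for positive samples, the sketch below ranks positives hierarchically (first by IoU within each ground-truth object, then across objects) and maps the ranks to loss weights. The function names `hlr_ranks` and `isr_weights`, the hyperparameters `beta` and `gamma`, their default values, and the mean-one normalization are assumptions made for illustration; this review does not specify them, and the linear-then-power mapping is one plausible choice rather than a confirmed formulation. The negative-sample branch (ranking by score) is omitted.

```python
import numpy as np

def hlr_ranks(ious, gt_ids):
    """Hierarchical local rank of each positive sample (0 = most important).

    ious   : IoU of each positive sample with its assigned ground-truth box
    gt_ids : index of the ground-truth object each sample is assigned to
    """
    ious = np.asarray(ious, dtype=float)
    gt_ids = np.asarray(gt_ids)
    n = len(ious)

    # Step 1: local rank, i.e. position by descending IoU within each object.
    local_rank = np.empty(n, dtype=int)
    for g in np.unique(gt_ids):
        idx = np.flatnonzero(gt_ids == g)
        order = idx[np.argsort(-ious[idx], kind="stable")]
        local_rank[order] = np.arange(len(idx))

    # Step 2: global order, sorted by local rank first and IoU second.
    # np.lexsort treats the LAST key as the primary key.
    order = np.lexsort((-ious, local_rank))
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(n)
    return ranks

def isr_weights(ranks, beta=0.2, gamma=2.0):
    """Map ranks to loss weights (hypothetical mapping, illustrative defaults)."""
    ranks = np.asarray(ranks, dtype=float)
    n = len(ranks)
    u = (n - ranks) / n                      # linear importance in (0, 1]
    w = ((1.0 - beta) * u + beta) ** gamma   # sharpen the preference
    return w * n / w.sum()                   # rescale so the mean weight is 1

if __name__ == "__main__":
    ranks = hlr_ranks([0.9, 0.85, 0.6, 0.5], [0, 0, 1, 1])
    print(ranks, isr_weights(ranks))
```

With IoUs of 0.9, 0.85, 0.6, and 0.5, where the first two samples belong to one object and the last two to another, the sample with IoU 0.6 ranks ahead of the one with IoU 0.85 because it is the best match for its own object. That is the difference between a hierarchical local ranking and a plain global sort by IoU.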

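Experimental Results

Research Questions

RQ1: Which samples matter most when training an object detector, and how can their importance be quantified?
RQ2: Does prioritizing prime samples during training improve mAP more than conventional random sampling or hard mining?
RQ3: How can classification and localization be jointly optimized to strengthen the focus on prime samples?

Key Findings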

Method	Backbone	AP	AP50	AP75	AP_S	AP_M	AP_L
Faster R-CNN	ResNet-50	36.7	58.8	39.6	21.6	39.8	44.9
Faster R-CNN	ResNeXt-101	40.3	62.7	44.0	24.4	43.7	49.8
Mask R-CNN	ResNet-50	37.5	59.4	40.7	22.1	40.6	46.2
Mask R-CNN	ResNeXt-101	41.4	63.4	45.2	24.5	44.9	51.8
Faster R-CNN w/ PISA	ResNet-50	38.8	59.3	42.7	22.1	41.7	48.8
Faster R-CNN w/ PISA	ResNeXt-101	42.3	62.9	46.8	24.8	45.5	53.1
Mask R-CNN w/ PISA	ResNet-50	39.3	59.6	43.5	22.1	42.3	49.4
Mask R-CNN w/ PISA	ResNeXt-101	42.9	63.2	47.4	24.9	46.2	54.0
RetinaNet	ResNet-50	37.3	56.5	40.3	20.3	40.4	47.2
RetinaNet w/ PISA	ResNet-50	37.3	56.5	40.3	20.3	40.4	47.2

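On COCO, PISA improves mAP for Faster R-CNN, Mask R-CNN, RetinaNet, and SSD-based detectors with backbones such as ResNet-50 and ResNeXt-101-32x4d.
On COCO test-dev, PISA yields an absolute mAP gain of about 2% over the baseline for both single-stage and two-stage detectors.
PISA outperforms random sampling and hard mining on both positive and negative samples, with notable gains at higher IoU thresholds (e.g., AP75).
HLR-based ranking places high-IoU positives and high-score negatives at the top of their respective ranking lists, steering training toward prime samples.
CARL modulates the classification score with the regression loss, which correlates classification with regression and improves performance on prime samples.
PISA also improves results on VOC07, indicating that it generalizes across datasets.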
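
The CARL finding above can be illustrated with a minimal sketch in which each positive sample's regression loss is scaled by a weight derived from its classification probability. The function name `carl_loss`, the parameters `b` and `k`, their defaults, and the specific weighting form are assumptions for illustration and are not given in this review. NumPy shows only the forward computation; in an autograd framework the gradient would also flow into the classification scores, which is what would couple the two branches.

```python
import numpy as np

def carl_loss(reg_losses, cls_probs, b=0.2, k=1.0):
    """Classification-aware regression loss (hypothetical form).

    reg_losses : per-sample regression loss of each positive sample
    cls_probs  : predicted probability of each sample's ground-truth class
    b, k       : illustrative bias and exponent, not values from the source
    """
    reg_losses = np.asarray(reg_losses, dtype=float)
    p = np.asarray(cls_probs, dtype=float)
    v = ((1.0 - b) * p + b) ** k   # score-dependent factor in [b**k, 1]
    c = v / v.mean()               # normalize so the mean weight is 1
    return float((c * reg_losses).sum())

if __name__ == "__main__":
    # Equal scores leave the plain sum of regression losses unchanged.
    print(carl_loss([0.1, 1.0], [0.5, 0.5]))
    # A confident, well-localized sample paired with an unconfident,
    # poorly localized one gives a smaller total.
    print(carl_loss([0.1, 1.0], [0.9, 0.1]))
```

Under this form, samples with a large regression loss contribute most when their scores are high, so minimizing the loss pushes down the scores of poorly localized samples relative to well-localized ones.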

This review was generated by AI and reviewed by a human editor.