QUICK REVIEW

[Paper Review] Libra R-CNN: Towards Balanced Learning for Object Detection

Jiangmiao Pang, Kai Chen|arXiv (Cornell University)|Apr 4, 2019

Advanced Neural Network Applications | References: 34 | Cited by: 138

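ONE-SENTENCE SUMMARY

Libra R-CNN introduces IoU-balanced sampling, a balanced feature pyramid, and a balanced L1 loss to address imbalance during training, raising AP on COCO by 2.5 points over FPN Faster R-CNN and 2.0 points over RetinaNet.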

ABSTRACT

Compared with model architectures, the training process, which is also crucial to the success of detectors, has received relatively less attention in object detection. In this work, we carefully revisit the standard training practice of detectors, and find that the detection performance is often limited by the imbalance during the training process, which generally consists in three levels - sample level, feature level, and objective level. To mitigate the adverse effects caused thereby, we propose Libra R-CNN, a simple but effective framework towards balanced learning for object detection. It integrates three novel components: IoU-balanced sampling, balanced feature pyramid, and balanced L1 loss, respectively for reducing the imbalance at sample, feature, and objective level. Benefitted from the overall balanced design, Libra R-CNN significantly improves the detection performance. Without bells and whistles, it achieves 2.5 points and 2.0 points higher Average Precision (AP) than FPN Faster R-CNN and RetinaNet respectively on MSCOCO.

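MOTIVATION AND OBJECTIVES

Identify and quantify training-time imbalance in object detectors at the sample, feature, and objective levels.
Propose a balanced learning framework (IoU-balanced sampling, balanced feature pyramid, balanced L1 loss) to mitigate these imbalances.
Demonstrate significant AP gains on MS COCO for both two-stage and single-stage detectors with standard backbones.
Show that the proposed components work together to improve both localization and recognition accuracy.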

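PROPOSED METHOD

IoU-balanced sampling uses the IoU distribution of candidate boxes to prioritize hard negative/positive samples at no additional cost.
The balanced feature pyramid integrates multi-level features by equalizing information across resolutions.
Balanced L1 loss promotes the crucial regression gradients in the joint classification and localization task while limiting the influence of outliers.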
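
To make the sampling idea concrete, here is a minimal sketch in plain Python. It assumes a bin-based scheme: the IoU range of negative candidates is split into equal-width bins and the sample budget is divided evenly among them, so the rare higher-IoU (harder) candidates are no longer swamped by the many near-zero ones. The function name `iou_balanced_sample`, the bin count of 3, and the IoU cut-off of 0.5 are illustrative assumptions and are not stated in this summary.

```python
import random


def iou_balanced_sample(ious, num_samples, num_bins=3, max_iou=0.5, seed=0):
    """Pick candidate indices so that each IoU bin contributes about equally.

    ious        -- IoU of each candidate box with its best-matching ground truth
    num_samples -- how many candidates to pick
    num_bins    -- number of equal-width IoU bins (assumed value)
    max_iou     -- candidates with IoU >= max_iou are ignored (assumed threshold)
    """
    rng = random.Random(seed)
    width = max_iou / num_bins

    # Group eligible candidate indices by IoU bin.
    bins = [[] for _ in range(num_bins)]
    for idx, iou in enumerate(ious):
        if 0.0 <= iou < max_iou:
            b = min(int(iou / width), num_bins - 1)
            bins[b].append(idx)

    # Give every bin the same share of the budget and sample uniformly inside it.
    per_bin = num_samples // num_bins
    chosen = []
    for members in bins:
        chosen.extend(rng.sample(members, min(per_bin, len(members))))

    # If some bins were too small, top up at random from the unused candidates.
    used = set(chosen)
    leftovers = [i for members in bins for i in members if i not in used]
    shortfall = min(num_samples - len(chosen), len(leftovers))
    chosen.extend(rng.sample(leftovers, shortfall))
    return chosen


# 90 easy candidates (IoU 0.01) and 10 harder ones (IoU 0.2 or 0.4).
ious = [0.01] * 90 + [0.2] * 5 + [0.4] * 5
picked = iou_balanced_sample(ious, num_samples=9)
hard = sum(1 for i in picked if ious[i] >= 0.1)
print(len(picked), hard)  # prints "9 6"
```

With uniform random sampling, only about one in ten of the picks in this toy example would be a harder candidate; the binned version returns six out of nine. The only extra work is computing a bin index per candidate, which is in line with the "no additional cost" claim above.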

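EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: What training-time imbalances exist in current object detectors at the sample, feature, and objective levels?
RQ2: Can a deliberately balanced training framework improve both localization and recognition without resorting to more complex architectures?
RQ3: Do IoU-balanced sampling, the balanced feature pyramid, and balanced L1 loss provide complementary gains when combined?
RQ4: How do these components affect performance on standard benchmarks (e.g., MS COCO) across different backbones?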

KEY FINDINGS

Method	Backbone	Schedule	AP	AP50	AP75	AP_S	AP_M	AP_L
YOLOv2	DarkNet-19	-	21.6	44.0	19.2	5.0	22.4	35.5
SSD512	ResNet-101	-	31.2	50.4	33.3	10.2	34.5	49.8
RetinaNet	ResNet-101-FPN	-	39.1	59.1	42.3	21.8	42.7	50.2
Faster R-CNN	ResNet-101-FPN	-	36.2	59.1	39.0	18.2	39.0	48.2
Deformable R-FCN	Inception-ResNet-v2	-	37.5	58.0	40.8	19.4	40.1	52.5
Mask R-CNN	ResNet-101-FPN	-	38.2	60.3	41.7	20.1	41.1	50.2
Faster R-CNN*	ResNet-50-FPN	1x	36.2	58.5	38.9	21.0	38.9	45.3
Faster R-CNN*	ResNet-101-FPN	1x	38.8	60.9	42.1	22.6	42.4	48.5
Faster R-CNN*	ResNet-101-FPN	2x	39.7	61.3	43.4	22.1	43.1	50.3
Faster R-CNN*	ResNeXt-101-FPN	1x	41.9	63.9	45.9	25.0	45.3	52.3
RetinaNet*	ResNet-50-FPN	1x	35.8	55.3	38.6	20.0	39.0	45.1
Libra R-CNN	ResNet-50-FPN	1x	38.7	59.9	42.0	22.5	41.1	48.7
Libra R-CNN	ResNet-101-FPN	1x	40.3	61.3	43.9	22.9	43.1	51.0
Libra R-CNN	ResNet-101-FPN	2x	41.1	62.1	44.7	23.4	43.7	52.5
Libra R-CNN	ResNeXt-101-FPN	1x	43.0	64.0	47.0	25.3	45.6	54.6
Libra RetinaNet	ResNet-50-FPN	1x	37.8	56.9	40.5	21.2	40.9	47.7
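
The gains can be read directly off the table by pairing each Libra row with the baseline row that shares its backbone and schedule. The short Python snippet below does that arithmetic; the AP values are copied from the table, while the dictionary layout and the name `ap_gains` are illustrative.

```python
# (detector family, backbone, schedule) -> (baseline AP, Libra AP), from the table above.
ap_pairs = {
    ("Faster R-CNN", "ResNet-50-FPN", "1x"): (36.2, 38.7),
    ("Faster R-CNN", "ResNet-101-FPN", "1x"): (38.8, 40.3),
    ("Faster R-CNN", "ResNet-101-FPN", "2x"): (39.7, 41.1),
    ("Faster R-CNN", "ResNeXt-101-FPN", "1x"): (41.9, 43.0),
    ("RetinaNet", "ResNet-50-FPN", "1x"): (35.8, 37.8),
}

# Round to one decimal to avoid floating-point noise in the differences.
ap_gains = {key: round(libra - base, 1) for key, (base, libra) in ap_pairs.items()}

for (family, backbone, schedule), gain in ap_gains.items():
    print(f"{family:13s} {backbone:16s} {schedule}: +{gain} AP")
```

The differences are +2.5, +1.5, +1.4, and +1.1 AP for the four Faster R-CNN settings and +2.0 AP for RetinaNet, so in this table the gain shrinks as the backbone gets stronger or the schedule gets longer.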

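Libra R-CNN achieves significant AP gains over its baselines on COCO: with ResNet-50-FPN it improves FPN Faster R-CNN by 2.5 AP and RetinaNet by 2.0 AP.
IoU-balanced sampling improves AP by up to about 0.9 points over the baseline on val-2017.
The balanced feature pyramid brings consistent gains on small, medium, and large objects and is complementary to PAFPN.
Balanced L1 loss improves localization, most visibly in AP75, by balancing the gradients contributed by inliers and outliers.
With a stronger backbone (e.g., ResNeXt-101-FPN), Libra R-CNN reaches 43.0 AP, outperforming several state-of-the-art detectors.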
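
The inlier/outlier balancing can be illustrated with a small pure-Python sketch. This summary does not give the loss formula, so the code assumes the piecewise form usually quoted for balanced L1: for a regression error x with |x| < 1 the gradient is alpha * ln(b|x| + 1), and for |x| >= 1 it is the constant gamma, with b chosen so that alpha * ln(b + 1) = gamma. The defaults alpha = 0.5 and gamma = 1.5 and the function names are assumptions for illustration.

```python
import math


def _b(alpha, gamma):
    # Chosen so the two gradient branches meet at |x| = 1: alpha * ln(b + 1) = gamma.
    return math.exp(gamma / alpha) - 1.0


def balanced_l1(x, alpha=0.5, gamma=1.5):
    """Balanced L1 loss for a single regression error x (assumed form)."""
    b = _b(alpha, gamma)
    ax = abs(x)
    if ax < 1.0:  # inlier branch
        return alpha / b * (b * ax + 1.0) * math.log(b * ax + 1.0) - alpha * ax
    # Outlier branch: linear, with an offset that keeps the loss continuous at |x| = 1.
    return gamma * ax + gamma / b - alpha


def balanced_l1_grad(x, alpha=0.5, gamma=1.5):
    """Derivative of balanced_l1 with respect to x."""
    b = _b(alpha, gamma)
    ax = abs(x)
    sign = (x > 0) - (x < 0)
    if ax < 1.0:
        return sign * alpha * math.log(b * ax + 1.0)
    return sign * gamma


def smooth_l1_grad(x):
    """Derivative of the standard smooth L1 loss (transition point at |x| = 1)."""
    sign = (x > 0) - (x < 0)
    return x if abs(x) < 1.0 else float(sign)


for err in (0.1, 0.5, 5.0):
    print(err, round(smooth_l1_grad(err), 3), round(balanced_l1_grad(err), 3))
```

Under these assumptions, a small error such as 0.1 receives a clearly larger gradient than under smooth L1, while the gradient of a large error is capped at gamma. Accurate samples therefore contribute more to the regression update, which is consistent with the AP75 improvement reported above.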

This review was generated by AI and reviewed by a human editor.