QUICK REVIEW

[Paper Review] Double-Head RCNN: Rethinking Classification and Localization for Object Detection

Yue Wu, Yinpeng Chen | arXiv (Cornell University) | Apr 13, 2019

Advanced Neural Network Applications | Cited by 9

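One-Sentence Summary

This paper proposes Double-Head R-CNN, an object detection framework that decouples classification from localization by assigning a fully connected head to classification and a convolution head to bounding box regression. On MS COCO, the method improves AP by +3.5 and +2.8 over FPN baselines with ResNet-50 and ResNet-101 backbones, respectively, by exploiting the complementary strengths of the two head structures.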

ABSTRACT

Two head structures (i.e. fully connected head and convolution head) have been widely used in R-CNN based detectors for classification and localization tasks. However, there is a lack of understanding of how does these two head structures work for these two tasks. To address this issue, we perform a thorough analysis and find an interesting fact that the two head structures have opposite preferences towards the two tasks. Specifically, the fully connected head (fc-head) is more suitable for the classification task, while the convolution head (conv-head) is more suitable for the localization task. Furthermore, we examine the output feature maps of both heads and find that fc-head has more spatial sensitivity than conv-head. Thus, fc-head has more capability to distinguish a complete object from part of an object, but is not robust to regress the whole object. Based upon these findings, we propose a Double-Head method, which has a fully connected head focusing on classification and a convolution head for bounding box regression. Without bells and whistles, our method gains +3.5 and +2.8 AP on MS COCO dataset from Feature Pyramid Network (FPN) baselines with ResNet-50 and ResNet-101 backbones, respectively.

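Research Motivation and Goals

Investigate the different roles that the fully connected head and the convolution head play in classification and localization within R-CNN based detectors.
Understand why existing head designs may underperform when a head structure is mismatched with its task.
Analyze the spatial sensitivity and feature representation of each head structure.
Rethink, based on empirical findings, how head structures are assigned to tasks in the detector.
Improve detection accuracy on MS COCO over FPN baselines without bells and whistles.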

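Proposed Method

The method is a Double-Head R-CNN architecture in which a fully connected head is dedicated to classification and a convolution head is dedicated to bounding box regression.
The fully connected head's higher spatial sensitivity is used to better distinguish a complete object from part of an object.
The convolution head is used for localization because it is more robust when regressing bounding box coordinates.
The design is evaluated on MS COCO with ResNet-50 and ResNet-101 backbones and FPN as the neck.
No bells and whistles are used, so the reported gains are attributed to the change in head architecture.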
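The head assignment described above can be sketched in a few lines of plain Python. This is only a toy illustration of the routing, not the authors' implementation: the class name `DoubleHead`, the layer sizes, the single 3x3 convolution, and the random weights are all assumptions chosen to keep the example small. The same RoI feature is fed to both heads; class scores are read from the fully connected branch and box deltas from the convolution branch.

```python
import random

def linear(x, weights, bias):
    """Dense layer: x is a flat list, weights is a list of rows (out x in)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def conv3x3_relu(fmap, kernels):
    """fmap: [C_in][H][W]; kernels: [C_out][C_in][3][3]; zero padding, stride 1."""
    c_in, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = []
    for k in kernels:
        plane = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                acc = 0.0
                for c in range(c_in):
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            yy, xx = y + dy, x + dx
                            if 0 <= yy < h and 0 <= xx < w:
                                acc += k[c][dy + 1][dx + 1] * fmap[c][yy][xx]
                plane[y][x] = max(0.0, acc)  # ReLU
        out.append(plane)
    return out

def global_avg_pool(fmap):
    """Average each channel over all spatial positions."""
    return [sum(map(sum, p)) / (len(p) * len(p[0])) for p in fmap]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class DoubleHead:
    """Hypothetical toy head: fc branch -> class scores, conv branch -> box deltas."""

    def __init__(self, channels, size, num_classes, hidden=8, seed=0):
        rng = random.Random(seed)
        self.hidden, self.num_classes = hidden, num_classes
        # fc-head parameters (classification only)
        self.fc1 = rand_matrix(rng, hidden, channels * size * size)
        self.cls = rand_matrix(rng, num_classes, hidden)
        # conv-head parameters (box regression only)
        self.kernels = [[rand_matrix(rng, 3, 3) for _ in range(channels)]
                        for _ in range(hidden)]
        self.box = rand_matrix(rng, 4, hidden)

    def forward(self, roi_feat):
        # fc-head: flatten, so every spatial position gets its own weights
        flat = [v for plane in roi_feat for row in plane for v in row]
        h_fc = relu(linear(flat, self.fc1, [0.0] * self.hidden))
        class_scores = linear(h_fc, self.cls, [0.0] * self.num_classes)
        # conv-head: weights shared across positions, then average pooling
        h_conv = global_avg_pool(conv3x3_relu(roi_feat, self.kernels))
        box_deltas = linear(h_conv, self.box, [0.0] * 4)
        return class_scores, box_deltas

rng = random.Random(1)
roi = [[[rng.random() for _ in range(7)] for _ in range(7)] for _ in range(2)]
head = DoubleHead(channels=2, size=7, num_classes=3)
scores, deltas = head.forward(roi)
print(len(scores), len(deltas))  # 3 class scores, 4 box deltas
```

The two branches share no parameters, which is the decoupling the review describes; a real detector would use learned weights and far larger layers.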

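Experimental Results

Research Questions

RQ1: How do the fully connected head and the convolution head differ in their preferences for the classification and localization tasks?
RQ2: How spatially sensitive is each head structure, and how does this affect object recognition and localization?
RQ3: Does assigning each head to the task it is better suited for improve detection performance?
RQ4: Why do the head designs in current R-CNN detectors fail to realize the full potential of each head type?
RQ5: Does the decoupled head assignment yield consistent AP gains across different backbone architectures?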

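Key Findings

The fully connected head shows higher spatial sensitivity, which makes it more effective at distinguishing a complete object from part of an object.
The convolution head is more robust for bounding box regression and outperforms the fully connected head on localization.
Assigning the fully connected head to classification and the convolution head to localization yields a +3.5 AP gain on MS COCO over the FPN baseline with a ResNet-50 backbone.
With a ResNet-101 backbone, the same assignment yields a +2.8 AP gain, indicating that the improvement carries over across backbones.
The gains are obtained without bells and whistles, which supports the value of rethinking the head architecture.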
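The notion of spatial sensitivity in the first finding can be made concrete with a small toy example. It is not the paper's measurement procedure, which this review does not describe; the function names, the 4x4 feature map, the position-indexed weights, and the 1x1 shared-weight "convolution" are all assumptions for illustration. A fully connected layer has a separate weight for every spatial position, so its response changes when the same pattern moves. A layer that shares one weight across positions and then averages gives the same response wherever the pattern sits.

```python
def fc_response(fmap, weights):
    """Fully connected style: one independent weight per spatial position."""
    return sum(w * v for wrow, vrow in zip(weights, fmap) for w, v in zip(wrow, vrow))

def conv_pool_response(fmap, shared_weight):
    """Toy conv style: a single shared (1x1) weight, then global average pooling."""
    n = len(fmap) * len(fmap[0])
    return sum(shared_weight * v for row in fmap for v in row) / n

def shift_right(fmap):
    """Move the whole pattern one column to the right (circular shift)."""
    return [row[-1:] + row[:-1] for row in fmap]

# A small "object" in the middle of a 4x4 feature map.
fmap = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
# Hypothetical fc weights: each position simply gets its own index as weight.
fc_weights = [[float(r * 4 + c) for c in range(4)] for r in range(4)]

shifted = shift_right(fmap)
print(fc_response(fmap, fc_weights), fc_response(shifted, fc_weights))  # 84.0 94.0
print(conv_pool_response(fmap, 0.5), conv_pool_response(shifted, 0.5))  # 0.3125 0.3125
```

A real convolution head uses larger kernels and padding, so it is not perfectly shift-invariant, but the contrast in how the two layer types treat position is the point being illustrated.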


This review was generated by AI and reviewed by a human editor.