QUICK REVIEW

[Paper Review] Rethinking Classification and Localization for Object Detection

Yue Wu, Yinpeng Chen|arXiv (Cornell University)|Apr 13, 2019

Advanced Neural Network Applications|References: 47|Cited by: 40

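One-Sentence Summary

This paper analyzes how fully connected and convolutional detection heads differ in their effect on classification and localization, and proposes the Double-Head detector, which uses an fc-head for classification and a conv-head for bounding box regression, gaining +3.5 and +2.8 AP on COCO over FPN baselines with ResNet-50 and ResNet-101 backbones.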

ABSTRACT

Two head structures (i.e. fully connected head and convolution head) have been widely used in R-CNN based detectors for classification and localization tasks. However, there is a lack of understanding of how does these two head structures work for these two tasks. To address this issue, we perform a thorough analysis and find an interesting fact that the two head structures have opposite preferences towards the two tasks. Specifically, the fully connected head (fc-head) is more suitable for the classification task, while the convolution head (conv-head) is more suitable for the localization task. Furthermore, we examine the output feature maps of both heads and find that fc-head has more spatial sensitivity than conv-head. Thus, fc-head has more capability to distinguish a complete object from part of an object, but is not robust to regress the whole object. Based upon these findings, we propose a Double-Head method, which has a fully connected head focusing on classification and a convolution head for bounding box regression. Without bells and whistles, our method gains +3.5 and +2.8 AP on MS COCO dataset from Feature Pyramid Network (FPN) baselines with ResNet-50 and ResNet-101 backbones, respectively.

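Research Motivation and Goals

Understand how fc-head and conv-head affect classification and localization in two-stage detectors.
Empirically compare fc-head and conv-head on the MS COCO 2017 validation set using predefined proposals.
Identify the complementary strengths and limitations of the two heads.
Propose a joint architecture (Double-Head) that uses both heads to improve detection performance.
Explore extensions that exploit the unfocused tasks to further improve accuracy.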

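Proposed Method

Train and compare fc-head and conv-head on FPN with a ResNet-50 backbone to evaluate classification versus localization performance.
Analyze the output feature maps to measure spatial sensitivity and the correlation between classification score and IoU.
Propose the Double-Head architecture: fc-head for classification, conv-head for bounding box regression.
Extend it to Double-Head-Ext by supervising each head's unfocused task (box regression in fc-head, classification in conv-head) and fusing the two classifiers at inference time.
Evaluate on COCO and VOC07, with ablations over backbones and head configurations.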
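The task split in Double-Head can be sketched with a toy, pure-Python forward pass. This is only an illustration under stated assumptions, not the authors' implementation: the 7x7 single-channel RoI feature, the layer sizes, the random weights, and every function name (`fc_head`, `conv_head`, `double_head`, `make_params`) are hypothetical. What it shows is the structural difference: the fc-head gives every spatial position its own weights, while the conv-head shares one kernel across positions and then pools.

```python
import random

def fc_head(roi, w_cls):
    # Flatten the H x W RoI feature: each spatial position has its own weight,
    # which is what makes a fully connected head spatially sensitive.
    flat = [v for row in roi for v in row]
    return [sum(w * x for w, x in zip(row_w, flat)) for row_w in w_cls]

def conv3x3_relu(roi, kernel):
    # One 3x3 kernel shared by all positions (zero padding, stride 1), then ReLU.
    h, w = len(roi), len(roi[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di + 1][dj + 1] * roi[y][x]
            out[i][j] = max(acc, 0.0)
    return out

def conv_head(roi, kernels, w_reg):
    # Global average pooling gives one value per kernel,
    # then a linear layer maps the pooled vector to 4 box deltas.
    pooled = []
    for k in kernels:
        fmap = conv3x3_relu(roi, k)
        pooled.append(sum(map(sum, fmap)) / (len(fmap) * len(fmap[0])))
    return [sum(w * p for w, p in zip(row_w, pooled)) for row_w in w_reg]

def double_head(roi, params):
    # Classification comes from the fc-head, box regression from the conv-head.
    cls_scores = fc_head(roi, params["w_cls"])
    box_deltas = conv_head(roi, params["kernels"], params["w_reg"])
    return cls_scores, box_deltas

def make_params(size=7, num_classes=3, num_kernels=2, seed=0):
    # Hypothetical random weights, only to make the sketch runnable.
    rng = random.Random(seed)
    r = lambda: rng.uniform(-0.1, 0.1)
    return {
        "w_cls": [[r() for _ in range(size * size)] for _ in range(num_classes)],
        "kernels": [[[r() for _ in range(3)] for _ in range(3)]
                    for _ in range(num_kernels)],
        "w_reg": [[r() for _ in range(num_kernels)] for _ in range(4)],
    }

roi = [[float((i + j) % 3) for j in range(7)] for i in range(7)]
scores, deltas = double_head(roi, make_params())
print(len(scores), len(deltas))
```

Both heads read the same RoI feature, so the split adds a second head per proposal rather than a second backbone pass.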

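Experimental Results

Research Questions

RQ1: Do fc-head and conv-head have complementary strengths in classification and localization?
RQ2: How does the spatial sensitivity of fc-head and conv-head differ, and how does it affect the correlation with IoU?
RQ3: Does separating the tasks into two heads outperform a single-head baseline in overall detection performance?
RQ4: Does adding the unfocused tasks and classifier fusion further improve accuracy?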
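For RQ2, the quantity of interest is how well a head's classification score tracks the IoU between a proposal and its ground-truth box. A minimal sketch of that measurement follows, assuming Pearson correlation as the statistic and boxes in `(x1, y1, x2, y2)` format. The score lists are made-up numbers for illustration only, not results from the paper.

```python
import math

def iou(a, b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

gt = (0.0, 0.0, 10.0, 10.0)
# Proposals obtained by shifting the ground-truth box to the right.
proposals = [(s, 0.0, 10.0 + s, 10.0) for s in (0.0, 2.0, 4.0, 6.0, 8.0)]
ious = [iou(gt, p) for p in proposals]

# Hypothetical classification scores for the same proposals (illustrative only).
scores_head_a = [0.95, 0.70, 0.45, 0.30, 0.10]  # drops as IoU drops
scores_head_b = [0.90, 0.88, 0.85, 0.80, 0.60]  # stays high for poor proposals

print(round(pearson(scores_head_a, ious), 3))
print(round(pearson(scores_head_b, ious), 3))
```

A head whose score falls off as the proposal drifts away from the object gets the higher correlation, which is the behavior the summary attributes to fc-head.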

Main Findings

Method	Backbone	AP	AP@0.5	AP@0.75	AP_S	AP_M	AP_L
FPN baseline	ResNet-50	36.8	58.7	40.4	21.2	40.1	48.8
Double-Head	ResNet-50	39.8	59.6	43.6	22.7	42.9	53.1
Double-Head-Ext	ResNet-50	40.3	60.3	44.2	22.4	43.3	54.3
FPN baseline	ResNet-101	39.1	61.0	42.4	22.2	42.5	51.0
Double-Head	ResNet-101	41.5	61.7	45.6	23.8	45.2	54.9
Double-Head-Ext	ResNet-101	41.9	62.4	45.9	23.9	45.2	55.8
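The AP gains over the FPN baseline can be read off the table with a few lines of arithmetic. The sketch below copies only the AP column from the table. The dictionary layout and the helper name `ap_gain` are illustrative choices.

```python
# AP values copied from the table above, keyed by (backbone, method).
results = {
    ("ResNet-50", "FPN baseline"): 36.8,
    ("ResNet-50", "Double-Head"): 39.8,
    ("ResNet-50", "Double-Head-Ext"): 40.3,
    ("ResNet-101", "FPN baseline"): 39.1,
    ("ResNet-101", "Double-Head"): 41.5,
    ("ResNet-101", "Double-Head-Ext"): 41.9,
}

def ap_gain(backbone, method):
    # Gain in AP points relative to the FPN baseline with the same backbone.
    baseline = results[(backbone, "FPN baseline")]
    return round(results[(backbone, method)] - baseline, 1)

for backbone in ("ResNet-50", "ResNet-101"):
    for method in ("Double-Head", "Double-Head-Ext"):
        print(backbone, method, ap_gain(backbone, method))
```

In this table, the +3.5 and +2.8 AP figures quoted in the abstract correspond to the Double-Head-Ext rows, while plain Double-Head gains +3.0 and +2.4.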

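The classification scores from fc-head correlate more strongly with proposal IoU than those from conv-head, and the difference is especially pronounced for small objects.
conv-head is more accurate than fc-head at bounding box regression.
Double-Head (fc-head for classification, conv-head for regression) outperforms the single-head baselines on COCO with both ResNet-50 and ResNet-101 backbones.
Double-Head-Ext further improves results by supervising the unfocused tasks and fusing the classifiers, reaching near state-of-the-art performance on COCO test-dev with a single training stage.
On VOC07, Double-Head-Ext clearly outperforms the FPN baseline on AP, AP@0.5, and AP@0.75.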
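As a headline number, Double-Head-Ext with ResNet-101 is reported at 42.3 AP, a clear gain over the corresponding baseline. The val2017 table above lists 41.9 AP for this configuration, so the 42.3 figure most likely refers to the COCO test-dev evaluation.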
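The two ingredients of Double-Head-Ext, supervision of the unfocused task during training and classifier fusion at inference, can be sketched as follows. This summary does not give the exact formulas, so both are assumptions: a convex weighting `lam` between each head's focused and unfocused loss, and a complementary fusion rule of the form s = s_fc + s_conv * (1 - s_fc). The function names and the numeric values are hypothetical.

```python
def head_loss(focused_loss, unfocused_loss, lam):
    # lam weights the head's focused task and (1 - lam) its unfocused task.
    # With lam = 1.0 the unfocused task is ignored, i.e. plain Double-Head.
    return lam * focused_loss + (1.0 - lam) * unfocused_loss

def fuse_scores(s_fc, s_conv):
    # Assumed complementary fusion: start from the fc-head score and let the
    # conv-head score fill in part of the remaining margin (1 - s_fc).
    # Algebraically equal to 1 - (1 - s_fc) * (1 - s_conv), so it is symmetric.
    return s_fc + s_conv * (1.0 - s_fc)

# Illustrative numbers only.
fc_total = head_loss(focused_loss=0.40, unfocused_loss=0.90, lam=0.7)
conv_total = head_loss(focused_loss=0.50, unfocused_loss=0.60, lam=0.8)
fused = fuse_scores(0.8, 0.5)
print(round(fc_total, 3), round(conv_total, 3), round(fused, 3))
```

Under this assumed rule the fused score stays in [0, 1] whenever both inputs do, and it is never lower than either individual score.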

This review was generated by AI and reviewed by human editors.