QUICK REVIEW

[Paper Review] CenterNet: Keypoint Triplets for Object Detection

Kaiwen Duan, Song Bai|arXiv (Cornell University)|Apr 17, 2019

Advanced Neural Network Applications | References: 45 | Cited by: 164

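One-Sentence Summary

CenterNet detects each object as a triplet of keypoints (a center point, a top-left corner, and a bottom-right corner) and uses center pooling and cascade corner pooling to reduce false detections. It reaches 47.0% AP on COCO, the best one-stage result at the time of publication, and is competitive with two-stage detectors.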

ABSTRACT

In object detection, keypoint-based approaches often suffer a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions. This paper presents an efficient solution which explores the visual patterns within each cropped region with minimal costs. We build our framework upon a representative one-stage keypoint-based detector named CornerNet. Our approach, named CenterNet, detects each object as a triplet, rather than a pair, of keypoints, which improves both precision and recall. Accordingly, we design two customized modules named cascade corner pooling and center pooling, which play the roles of enriching information collected by both top-left and bottom-right corners and providing more recognizable information at the central regions, respectively. On the MS-COCO dataset, CenterNet achieves an AP of 47.0%, which outperforms all existing one-stage detectors by at least 4.9%. Meanwhile, with a faster inference speed, CenterNet demonstrates quite comparable performance to the top-ranked two-stage detectors. Code is available at https://github.com/Duankaiwen/CenterNet.

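Motivation and Objectives

Improve one-stage keypoint-based object detectors by exploiting the visual patterns inside each object region.
Extend CornerNet with a center keypoint so that each object is represented by a keypoint triplet, giving a more robust object representation.
Enrich center and corner features through center pooling and cascade corner pooling to raise both precision and recall.
Evaluate CenterNet on MS-COCO to quantify the AP and AR gains across object scales and to compare it with state-of-the-art detectors.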

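Proposed Method

Represent each object as a triplet consisting of a center keypoint and a pair of corners.
Predict corner heatmaps with embeddings and offsets as in CornerNet, add a center keypoint heatmap, and form candidate bounding boxes from paired corners.
For each candidate box, define a scale-aware central region and keep the box only if a center keypoint of the same class falls inside that region.
Introduce center pooling, which strengthens center keypoints by summing the maximum responses along the horizontal and vertical directions.
Introduce cascade corner pooling, which enriches corner features by combining the maximum response along the box boundary with the maximum response found looking inward from it.
Train with a multi-term loss: focal losses for corners and centers, pull/push embedding losses, and offset losses. At inference, apply center verification and non-maximum suppression.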
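The center pooling step described above can be sketched in a few lines. This is a minimal pure-Python illustration on a single 2D feature map, not the authors' implementation: the function name `center_pooling` and the toy values are hypothetical, and the real network applies the operation to learned multi-channel feature maps.

```python
from typing import List

Grid = List[List[float]]


def center_pooling(feature: Grid) -> Grid:
    """For every location, add the max of its row to the max of its column.

    Illustrative only: assumes a single-channel, non-empty, rectangular map.
    """
    row_max = [max(row) for row in feature]            # horizontal maxima
    col_max = [max(col) for col in zip(*feature)]      # vertical maxima
    return [
        [row_max[i] + col_max[j] for j in range(len(col_max))]
        for i in range(len(row_max))
    ]


# Toy 3x3 feature map (hypothetical values).
feature = [
    [1.0, 9.0, 2.0],
    [3.0, 2.0, 1.0],
    [8.0, 1.0, 4.0],
]
pooled = center_pooling(feature)
print(pooled[1][1])  # row max 3.0 + column max 9.0 = 12.0
```

The middle location has a weak response of its own (2.0), yet after pooling it carries the strong responses found along its row and column, which is the intuition behind making the geometric center of an object easier to recognize.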

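Experimental Results

Research Questions

RQ1: Does requiring a center keypoint inside the central region improve the correctness of corner-based detection?
RQ2: Does enriching center and corner information through center pooling and cascade corner pooling improve AP and AR on COCO?
RQ3: How does CenterNet perform on MS-COCO compared with CornerNet and other state-of-the-art detectors?
RQ4: How does the scale-aware central region affect the detection of small and large objects?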
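To make the check behind RQ1 and RQ4 concrete, here is a hedged sketch of the central-region test. This review does not give the formula. The region definition and the settings n = 3 for boxes with scale below 150 and n = 5 otherwise are recalled from the original paper and should be verified against it. Measuring "scale" as the square root of the box area is an assumption made here, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (tlx, tly, brx, bry)
Center = Tuple[float, float, int]         # (x, y, class_id)


def central_region(box: Box, n: int) -> Box:
    """Central region whose sides are 1/n of the box sides (n odd)."""
    tlx, tly, brx, bry = box
    ctlx = ((n + 1) * tlx + (n - 1) * brx) / (2 * n)
    ctly = ((n + 1) * tly + (n - 1) * bry) / (2 * n)
    cbrx = ((n - 1) * tlx + (n + 1) * brx) / (2 * n)
    cbry = ((n - 1) * tly + (n + 1) * bry) / (2 * n)
    return (ctlx, ctly, cbrx, cbry)


def choose_n(box: Box, scale_threshold: float = 150.0) -> int:
    """Assumption: scale = sqrt(area); n = 3 below the threshold, else 5."""
    tlx, tly, brx, bry = box
    scale = math.sqrt((brx - tlx) * (bry - tly))
    return 3 if scale < scale_threshold else 5


def is_verified(box: Box, box_class: int, centers: List[Center]) -> bool:
    """Keep a box only if a same-class center lies in its central region."""
    ctlx, ctly, cbrx, cbry = central_region(box, choose_n(box))
    return any(
        cls == box_class and ctlx <= x <= cbrx and ctly <= y <= cbry
        for x, y, cls in centers
    )


centers = [(45.0, 45.0, 1), (200.0, 10.0, 2)]
small_box = (0.0, 0.0, 90.0, 90.0)
large_box = (0.0, 0.0, 300.0, 300.0)

print(central_region(small_box, 3))        # (30.0, 30.0, 60.0, 60.0)
print(is_verified(small_box, 1, centers))  # True: center (45, 45) is inside
print(is_verified(large_box, 1, centers))  # False: no center in the region
```

Under these assumptions the central region spans one third of each side for small boxes and one fifth for large boxes. The relatively larger region for small boxes is meant to protect recall, and the relatively smaller region for large boxes is meant to protect precision.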

Key Findings

Method	Backbone	Train input	Test input	AP	AP 50	AP 75	AP S	AP M	AP L	AR 1	AR 10	AR 100	AR S	AR M	AR L
CornerNet511-52	Hourglass-52	511×511	ori.	37.8	53.7	40.1	17.0	39.0	50.5	33.9	52.3	57.0	35.0	59.3	74.7
CornerNet511-104	Hourglass-104	511×511	ori.	40.5	56.5	43.1	19.4	42.7	53.9	35.3	54.3	59.1	37.4	61.9	76.9
CornerNet511 (multi-scale)	Hourglass-52	511×511	<=1.5×	39.4	54.9	42.3	18.9	41.2	53.5	35.0	53.5	57.7	36.1	60.1	75.1
CornerNet511 (multi-scale)	Hourglass-104	511×511	<=1.5×	42.1	57.8	45.3	20.8	44.8	56.7	36.4	55.7	60.0	38.5	62.7	77.4
CenterNet511-52	Hourglass-52	511×511	ori.	41.6	59.4	44.2	22.5	43.1	54.1	34.8	55.7	60.1	38.6	63.3	76.9
CenterNet511-104	Hourglass-104	511×511	ori.	44.9	62.4	48.1	25.6	47.4	57.4	36.1	58.4	63.3	41.3	67.1	80.2
CenterNet511 (multi-scale)	Hourglass-52	511×511	<=1.8×	43.5	61.3	46.7	25.3	45.3	55.0	36.0	57.2	61.3	41.4	64.0	76.3
CenterNet511 (multi-scale)	Hourglass-104	511×511	<=1.8×	47.0	64.5	50.7	28.9	49.9	58.9	37.5	60.3	64.8	45.1	68.3	79.7
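The single-scale rows of the table can be compared directly. The short script below is only a reading aid: it copies four columns from the CornerNet511 and CenterNet511 rows with test input "ori." and prints the per-backbone difference. The dictionary layout and the names `TABLE` and `gains` are illustrative choices.

```python
# Values copied from the single-scale ("ori.") rows of the table above.
TABLE = {
    "Hourglass-52": {
        "CornerNet": {"AP": 37.8, "AP_S": 17.0, "AP_L": 50.5, "AR_100": 57.0},
        "CenterNet": {"AP": 41.6, "AP_S": 22.5, "AP_L": 54.1, "AR_100": 60.1},
    },
    "Hourglass-104": {
        "CornerNet": {"AP": 40.5, "AP_S": 19.4, "AP_L": 53.9, "AR_100": 59.1},
        "CenterNet": {"AP": 44.9, "AP_S": 25.6, "AP_L": 57.4, "AR_100": 63.3},
    },
}


def gains(table):
    """Return CenterNet minus CornerNet for every metric, per backbone."""
    out = {}
    for backbone, methods in table.items():
        base, ours = methods["CornerNet"], methods["CenterNet"]
        # Round to one decimal to match the precision reported in the table.
        out[backbone] = {m: round(ours[m] - base[m], 1) for m in base}
    return out


GAINS = gains(TABLE)
for backbone, delta in GAINS.items():
    print(backbone, delta)
```

With both backbones the overall AP gain is about four points (3.8 and 4.4), and the gain on small objects (5.5 and 6.2) is larger than the gain on large objects (3.6 and 3.5).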

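CenterNet reaches 47.0% AP on COCO test-dev with an Hourglass-104 backbone and multi-scale testing, ahead of all one-stage detectors existing at the time by at least 4.9% AP.
Compared with CornerNet, CenterNet markedly reduces incorrect bounding boxes (false discoveries), especially for small objects.
CenterNet, with center pooling and cascade corner pooling, improves both AP and AR over the CornerNet baseline at single scale: AP rises from 37.8 to 41.6 with Hourglass-52 and from 40.5 to 44.9 with Hourglass-104. Small objects gain the most (AP S +5.5 and +6.2), while large objects gain less (AP L +3.6 and +3.5).
The scale-aware central region improves recall for small boxes while preserving precision for large boxes.
With Hourglass-104, CenterNet reaches 44.9% AP with single-scale testing and 47.0% AP with multi-scale testing, which is competitive with the top two-stage detectors.
Inference speed remains practical (270–340 ms per image), alongside a clear accuracy gain over the baseline.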


This review was generated by AI and checked by a human editor.