QUICK REVIEW

[Paper Review] CornerNet: Detecting Objects as Paired Keypoints

Hei Law, Jia Deng|arXiv (Cornell University)|Aug 3, 2018

Advanced Neural Network Applications|48 references|147 citations

TL;DR

CornerNet detects objects as pairs of corners (top-left and bottom-right) using a single network with corner pooling and associative embeddings, achieving strong COCO one-stage results without anchor boxes.

ABSTRACT

We propose CornerNet, a new approach to object detection where we detect an object bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolution neural network. By detecting objects as paired keypoints, we eliminate the need for designing a set of anchor boxes commonly used in prior single-stage detectors. In addition to our novel formulation, we introduce corner pooling, a new type of pooling layer that helps the network better localize corners. Experiments show that CornerNet achieves a 42.2% AP on MS COCO, outperforming all existing one-stage detectors.

Motivation & Objective

Motivate removing anchor boxes from one-stage detectors due to inefficiencies and design complexity.
Propose detecting objects as paired keypoints (top-left and bottom-right corners) with category-specific heatmaps.
Introduce corner pooling to improve corner localization where local evidence is weak.
Apply associative embeddings to group corner pairs belonging to the same object.
Demonstrate state-of-the-art one-stage performance on MS COCO and provide ablations of key components.

Proposed method

Predict two heatmaps per category: one for top-left corners and one for bottom-right corners.
Predict a 1D embedding per detected corner to group paired corners of the same object via pull/push losses.
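Predict corner offsets to recover the precision lost when corner locations are remapped from the downsampled heatmaps to the input image.
Propose corner pooling: for a top-left corner, max-pool one feature map over everything to its right and another over everything below it, then sum the two results (mirrored for bottom-right corners), so that distant boundary evidence informs the corner location.
Adopt an hourglass network as the backbone, with tailored prediction modules for heatmaps, embeddings, and offsets.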
Train with a variant of focal loss and an object-dependent radius for down-weighting nearby negatives.
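
The pull/push grouping losses mentioned above can be sketched as follows. This is a minimal NumPy illustration rather than the authors' implementation: the function name `pull_push_losses` and its arguments are hypothetical, and the margin of 1 follows the setting reported in the paper. The pull term draws the two corner embeddings of each object toward their mean; the push term keeps the means of different objects at least one margin apart.

```python
import numpy as np

def pull_push_losses(e_tl, e_br, delta=1.0):
    """Hypothetical helper: 1D embeddings of the top-left and bottom-right
    corners of N ground-truth objects -> (pull loss, push loss)."""
    e_tl = np.asarray(e_tl, dtype=float)
    e_br = np.asarray(e_br, dtype=float)
    n = e_tl.shape[0]

    # Per-object mean embedding e_k.
    e_mean = (e_tl + e_br) / 2.0

    # Pull: both corners of an object should sit close to the object's mean.
    pull = np.mean((e_tl - e_mean) ** 2 + (e_br - e_mean) ** 2)

    if n < 2:
        return pull, 0.0  # nothing to push apart with a single object

    # Push: hinge on the distance between the means of different objects.
    dist = np.abs(e_mean[:, None] - e_mean[None, :])
    margin = np.maximum(0.0, delta - dist)
    np.fill_diagonal(margin, 0.0)  # exclude the j == k terms
    push = margin.sum() / (n * (n - 1))
    return pull, push
```

At inference time the same distance is used the other way round: a top-left and a bottom-right corner are paired only when their embeddings are close.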
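
Corner pooling, as described in the list above, can be illustrated with a reverse cumulative maximum. The sketch below assumes single-channel (H, W) feature maps and uses NumPy instead of a trainable layer; the function names are hypothetical and `f_t`, `f_l`, `f_b`, `f_r` simply denote the two input maps of each pooling layer.

```python
import numpy as np

def top_left_corner_pool(f_t, f_l):
    """Top-left corner pooling on two (H, W) feature maps (illustrative)."""
    f_t = np.asarray(f_t, dtype=float)
    f_l = np.asarray(f_l, dtype=float)
    # t[i, j] = max of f_t[i:, j]: running max from the bottom row upward.
    t = np.maximum.accumulate(f_t[::-1, :], axis=0)[::-1, :]
    # l[i, j] = max of f_l[i, j:]: running max from the right column leftward.
    l = np.maximum.accumulate(f_l[:, ::-1], axis=1)[:, ::-1]
    return t + l

def bottom_right_corner_pool(f_b, f_r):
    """Mirror image: pool over everything above and everything to the left."""
    f_b = np.asarray(f_b, dtype=float)
    f_r = np.asarray(f_r, dtype=float)
    b = np.maximum.accumulate(f_b, axis=0)  # b[i, j] = max of f_b[:i+1, j]
    r = np.maximum.accumulate(f_r, axis=1)  # r[i, j] = max of f_r[i, :j+1]
    return b + r

# Toy example: evidence lies below and to the right of location (1, 1),
# but not at (1, 1) itself.
f_t = np.zeros((4, 4)); f_t[3, 1] = 2.0   # found by looking down column 1
f_l = np.zeros((4, 4)); f_l[1, 3] = 3.0   # found by looking right along row 1
pooled = top_left_corner_pool(f_t, f_l)
```

In the toy example the strongest response of `pooled` lands at (1, 1), even though neither input map has any activation there, which is the situation corner pooling is meant to handle.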

Experimental results

Research questions

RQ1: Can objects be accurately detected by pairing corner keypoints instead of using anchor boxes?
RQ2: Does corner pooling improve the localization of bounding box corners and overall detection accuracy?
RQ3: How effective are associative embeddings in correctly grouping paired corners from the same object?
RQ4: What is the impact of learning corner offsets and the modified loss terms on COCO performance?

Key findings

Corner pooling significantly improves AP by about 2.0 points on COCO validation.
Using object-dependent penalty reduction for negative locations yields notable AP gains over fixed-radius strategies.
Corner pooling enhances performance for medium and large objects more than small ones.
Combining the hourglass backbone with corner-based prediction yields higher AP than either an FPN-based backbone with corner prediction or an hourglass backbone with anchor boxes.
On COCO test-dev, CornerNet surpasses all one-stage detectors and competes with many two-stage detectors.
Replacing the predicted heatmaps with ground-truth heatmaps raises AP to ~73.1, suggesting that corner detection is the main bottleneck.
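
The penalty reduction for negative locations can be sketched as below. This is a hedged NumPy illustration, not the released code: the function names are hypothetical, the radius is passed in directly (the paper derives it from object size, which is omitted here and assumed positive), and zeroing the Gaussian outside the radius is one reading of the description. The Gaussian uses a standard deviation of one third of the radius, and the loss uses the paper's settings alpha = 2 and beta = 4.

```python
import numpy as np

def gaussian_heatmap(height, width, corner_rc, radius):
    """Ground-truth heatmap for one corner: 1 at the corner, decaying
    Gaussian within `radius`, 0 elsewhere. `radius` is assumed > 0."""
    sigma = radius / 3.0
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    r0, c0 = corner_rc
    d2 = (rows - r0) ** 2 + (cols - c0) ** 2
    y = np.exp(-d2 / (2.0 * sigma ** 2))
    y[d2 > radius ** 2] = 0.0  # no penalty reduction outside the radius
    return y

def corner_focal_loss(p, y, alpha=2.0, beta=4.0, eps=1e-12):
    """Focal-loss variant: negatives are down-weighted by (1 - y)^beta."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    pos = (y == 1.0)
    pos_term = ((1.0 - p) ** alpha * np.log(p))[pos].sum()
    neg_term = ((1.0 - y) ** beta * p ** alpha * np.log(1.0 - p))[~pos].sum()
    n_pos = max(int(pos.sum()), 1)  # normalize by the number of positives
    return -(pos_term + neg_term) / n_pos

# The same confident false positive costs less next to the true corner
# than far away from it.
y = gaussian_heatmap(7, 7, corner_rc=(3, 3), radius=3)
base = np.full((7, 7), 0.01); base[3, 3] = 0.9
near = base.copy(); near[3, 4] = 0.9
far = base.copy(); far[0, 0] = 0.9
loss_near = corner_focal_loss(near, y)
loss_far = corner_focal_loss(far, y)
```

Here `loss_near` comes out smaller than `loss_far`, reflecting that a corner predicted slightly off the ground-truth location can still produce a box that overlaps the object well.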

This review was created by AI and reviewed by human editors.