[Paper Review] CornerNet: Detecting Objects as Paired Keypoints
CornerNet detects objects as pairs of corners (top-left and bottom-right) using a single network with corner pooling and associative embeddings, achieving strong COCO one-stage results without anchor boxes.
We propose CornerNet, a new approach to object detection where we detect an object bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolution neural network. By detecting objects as paired keypoints, we eliminate the need for designing a set of anchor boxes commonly used in prior single-stage detectors. In addition to our novel formulation, we introduce corner pooling, a new type of pooling layer that helps the network better localize corners. Experiments show that CornerNet achieves a 42.2% AP on MS COCO, outperforming all existing one-stage detectors.
Motivation & Objective
- Motivate removing anchor boxes from one-stage detectors due to inefficiencies and design complexity.
- Propose detecting objects as paired keypoints (top-left and bottom-right corners) with category-specific heatmaps.
- Introduce corner pooling to improve corner localization where local evidence is weak.
- Develop associative embeddings to group corner pairs belonging to the same object.
- Demonstrate state-of-the-art one-stage performance on MS COCO and provide ablations of key components.
Proposed method
- Predict two heatmaps per category: one for top-left corners and one for bottom-right corners.
- Predict a 1D embedding per detected corner to group paired corners of the same object via pull/push losses.
- Predict corner offsets to recover the precision lost when corner locations are remapped from the downsampled heatmap back to the input resolution.
- Propose corner pooling to aggregate far-field boundary information by horizontal and vertical max-pooling and summing results.
- Adopt hourglass network as backbone with a tailored prediction module for heatmaps, embeddings, and offsets.
- Train with a variant of focal loss and an object-dependent radius for down-weighting nearby negatives.
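Two of the components above, corner pooling and the pull/push grouping loss, can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation. The function names (`top_left_corner_pool`, `pull_push_loss`), the single-channel `(H, W)` input shape, and the margin `delta=1.0` are assumptions made for the example. The pooling function covers only the top-left case; the bottom-right case would scan in the opposite directions.

```python
import numpy as np


def top_left_corner_pool(f_t, f_l):
    """Top-left corner pooling on two (H, W) feature maps (illustrative sketch).

    For each pixel, take the max of f_t over the pixel and everything below it,
    and the max of f_l over the pixel and everything to its right, then sum.
    """
    f_t = np.asarray(f_t, dtype=float)
    f_l = np.asarray(f_l, dtype=float)
    # Running max from the bottom row upward (vertical scan).
    t = np.maximum.accumulate(f_t[::-1, :], axis=0)[::-1, :]
    # Running max from the rightmost column leftward (horizontal scan).
    l = np.maximum.accumulate(f_l[:, ::-1], axis=1)[:, ::-1]
    return t + l


def pull_push_loss(e_tl, e_br, delta=1.0):
    """Pull/push losses on 1D corner embeddings (illustrative sketch).

    e_tl[k] and e_br[k] are the embeddings of the top-left and bottom-right
    corners of object k. The pull term draws each pair toward its mean; the
    push term separates the means of different objects by a margin delta
    (assumed value here).
    """
    e_tl = np.asarray(e_tl, dtype=float)
    e_br = np.asarray(e_br, dtype=float)
    n = len(e_tl)
    e_mean = (e_tl + e_br) / 2.0
    pull = np.mean((e_tl - e_mean) ** 2 + (e_br - e_mean) ** 2)
    if n < 2:
        return pull, 0.0
    # Pairwise distances between object means, hinged at the margin.
    dist = np.abs(e_mean[:, None] - e_mean[None, :])
    hinge = np.maximum(0.0, delta - dist)
    np.fill_diagonal(hinge, 0.0)  # an object is not pushed away from itself
    push = hinge.sum() / (n * (n - 1))
    return pull, push


if __name__ == "__main__":
    pooled = top_left_corner_pool([[0, 1], [2, 0]], [[0, 3], [1, 0]])
    print(pooled)
    print(pull_push_loss([0.0, 0.5], [1.0, 0.5]))
```

The reversed cumulative max is what lets a corner location, which often lies outside the object itself, pick up boundary evidence from far along its row and column at linear cost per row or column.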
Experimental results
Research questions
- RQ1: Can objects be accurately detected by pairing corner keypoints instead of using anchor boxes?
- RQ2: Does corner pooling improve the localization of bounding box corners and overall detection accuracy?
- RQ3: How effective are associative embeddings in correctly grouping paired corners from the same object?
- RQ4: What is the impact of learning corner offsets and the modified loss terms on COCO performance?
Key findings
- Corner pooling significantly improves AP by about 2.0 points on COCO validation.
- Using object-dependent penalty reduction for negative locations yields notable AP gains over fixed-radius strategies.
- Corner pooling enhances performance for medium and large objects more than small ones.
- In ablations, the hourglass backbone with corner-based predictions achieves higher AP than both an FPN-based backbone and an anchor-box variant.
- On COCO test-dev, CornerNet surpasses all one-stage detectors and competes with many two-stage detectors.
- Replacing the predicted corner heatmaps with ground-truth heatmaps raises AP to about 73.1, which suggests that corner detection, rather than grouping, is the main bottleneck.
This review was created by AI and reviewed by human editors.