QUICK REVIEW

[Paper Review] CornerNet-Lite: Efficient Keypoint Based Object Detection

Hei Law, Yun Teng|arXiv (Cornell University)|Apr 18, 2019

Advanced Neural Network Applications|66 references|164 citations

TL;DR

CornerNet-Lite combines CornerNet-Saccade and CornerNet-Squeeze to improve efficiency for keypoint-based detection, achieving faster inference with competitive or better AP than prior real-time detectors on COCO.

ABSTRACT

Keypoint-based methods are a relatively new paradigm in object detection, eliminating the need for anchor boxes and offering a simplified detection framework. Keypoint-based CornerNet achieves state of the art accuracy among single-stage detectors. However, this accuracy comes at high processing cost. In this work, we tackle the problem of efficient keypoint-based object detection and introduce CornerNet-Lite. CornerNet-Lite is a combination of two efficient variants of CornerNet: CornerNet-Saccade, which uses an attention mechanism to eliminate the need for exhaustively processing all pixels of the image, and CornerNet-Squeeze, which introduces a new compact backbone architecture. Together these two variants address the two critical use cases in efficient object detection: improving efficiency without sacrificing accuracy, and improving accuracy at real-time efficiency. CornerNet-Saccade is suitable for offline processing, improving the efficiency of CornerNet by 6.0x and the AP by 1.0% on COCO. CornerNet-Squeeze is suitable for real-time detection, improving both the efficiency and accuracy of the popular real-time detector YOLOv3 (34.4% AP at 30ms for CornerNet-Squeeze compared to 33.0% AP at 39ms for YOLOv3 on COCO). Together these contributions for the first time reveal the potential of keypoint-based detection to be useful for applications requiring processing efficiency.

Motivation & Objective

Address the speed-accuracy tradeoff in anchor-free, keypoint-based object detection.
Propose two efficient variants of CornerNet to improve offline and real-time performance.
Demonstrate that saccades and a compact backbone can yield significant speedups with minimal AP loss or even gains.
Evaluate CornerNet-Lite on COCO to compare against YOLOv3 and CornerNet.
Highlight practical training and architectural adaptations that enable real-time or near-real-time inference.

Proposed method

Introduce CornerNet-Saccade that uses an attention-based downscaled pass to propose object locations and then processes selected high-resolution crops in parallel.
Develop CornerNet-Squeeze with a compact hourglass backbone inspired by SqueezeNet and MobileNets to reduce per-pixel computation.
Employ an hourglass-54 backbone for Saccade to balance depth and efficiency.
Use Soft-NMS and boundary-crop suppression to handle partial objects and overlapping crops.
Train using the same CornerNet losses for corner heatmaps, embeddings, and offsets across variants.
Compare inference time and accuracy on COCO using a consistent hardware setup.
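
As a rough illustration of the crop-merging step listed above, the following Python sketch drops detections that touch a crop boundary (likely partial objects) and then applies Gaussian Soft-NMS to the detections pooled from all crops. The function names, the `margin` tolerance, the Gaussian decay form, and the `sigma` and `score_thresh` values are assumptions made for illustration; this summary does not specify them, and the sketch is not the authors' implementation.

```python
import math

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def touches_crop_boundary(box, crop, margin=1.0):
    """True if any side of the box lies within `margin` pixels of the crop edge.
    `margin` is a hypothetical tolerance, not a value from the paper."""
    x1, y1, x2, y2 = box
    cx1, cy1, cx2, cy2 = crop
    return (x1 - cx1 <= margin or y1 - cy1 <= margin
            or cx2 - x2 <= margin or cy2 - y2 <= margin)

def soft_nms(dets, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS over a list of (box, score) pairs.
    Overlapping boxes have their scores decayed instead of being discarded."""
    remaining = list(dets)
    kept = []
    while remaining:
        # Pick the highest-scoring remaining detection.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        kept.append((box, score))
        # Decay the scores of the others according to their overlap with it.
        rescored = []
        for other_box, other_score in remaining:
            decayed = other_score * math.exp(-(iou(box, other_box) ** 2) / sigma)
            if decayed >= score_thresh:
                rescored.append((other_box, decayed))
        remaining = rescored
    return kept

def merge_crop_detections(per_crop, sigma=0.5):
    """per_crop: list of (crop, [(box, score), ...]), all in image coordinates."""
    pooled = [
        (box, score)
        for crop, dets in per_crop
        for box, score in dets
        if not touches_crop_boundary(box, crop)
    ]
    return soft_nms(pooled, sigma=sigma)

# Two overlapping crops (made-up coordinates). The second object is cut off
# by crop A's right edge but fully visible in crop B.
crop_a = (0, 0, 100, 100)
crop_b = (50, 0, 150, 100)
per_crop = [
    (crop_a, [((60, 20, 90, 80), 0.9), ((70, 10, 100, 60), 0.6)]),
    (crop_b, [((60, 20, 90, 80), 0.8), ((70, 10, 130, 60), 0.7)]),
]
merged = merge_crop_detections(per_crop)
```

In this toy input, the truncated box from crop A is removed before merging, and the duplicate detection of the first object keeps only a heavily decayed score, so a downstream score threshold would discard it.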

Experimental results

Research questions

RQ1: Can a saccade-like attention mechanism reduce the number of pixels processed without sacrificing CornerNet accuracy?
RQ2: Can a compact backbone (CornerNet-Squeeze) provide real-time performance while maintaining or improving AP?
RQ3: How do CornerNet-Saccade and CornerNet-Squeeze compare against the original CornerNet and YOLOv3 in terms of speed and accuracy on COCO?
RQ4: Is combining saccades with the ultra-compact backbone beneficial or detrimental for real-time detection?
RQ5: What are the trade-offs in training efficiency and memory usage for these variants?

Key findings

CornerNet-Saccade achieves 6x speed-up over CornerNet with a 1% AP increase on COCO (AP from 42.2% to 43.2%).
CornerNet-Squeeze achieves 34.4% AP at 30ms, outperforming YOLOv3 (33.0% at 39ms) on COCO.
Together, the two variants cover both use cases: CornerNet-Saccade improves offline efficiency without sacrificing accuracy, and CornerNet-Squeeze delivers real-time inference with competitive AP.
CornerNet-Saccade uses a downsized image to predict attention maps for multiple object locations across sizes (small, medium, large).
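Combining saccades with the CornerNet-Squeeze backbone yields worse performance than CornerNet-Squeeze alone: the ultra-compact backbone lacks the capacity to predict accurate attention maps, and saccades help only when the attention maps are sufficiently accurate.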
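
To put the real-time comparison above in more familiar units, this short Python sketch converts the reported per-image latencies into frames per second and computes the relative speedup and AP gap. The AP and latency figures are the ones quoted in this review; the dictionary layout and the helper name `fps` are illustrative choices only.

```python
def fps(latency_ms):
    """Frames per second implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms

# Figures quoted in this review (COCO).
results = {
    "CornerNet-Squeeze": {"ap": 34.4, "latency_ms": 30.0},
    "YOLOv3": {"ap": 33.0, "latency_ms": 39.0},
}

squeeze = results["CornerNet-Squeeze"]
yolo = results["YOLOv3"]

# Ratio of latencies: how many times faster CornerNet-Squeeze is per image.
speedup = yolo["latency_ms"] / squeeze["latency_ms"]
# Absolute AP difference in percentage points.
ap_gain = squeeze["ap"] - yolo["ap"]

for name, r in results.items():
    print(f"{name}: {r['ap']:.1f}% AP, {r['latency_ms']:.0f} ms, {fps(r['latency_ms']):.1f} FPS")
print(f"speedup: {speedup:.2f}x, AP gain: {ap_gain:.1f} points")
```

By this arithmetic, 30ms per image is roughly 33 frames per second and 39ms is roughly 26, so CornerNet-Squeeze is about 1.3x faster than YOLOv3 while scoring 1.4 AP points higher.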

This review was created by AI and reviewed by human editors.