QUICK REVIEW

[Paper Review] How good are detection proposals, really?

Jan Hosang, Rodrigo Benenson | arXiv (Cornell University) | Jun 26, 2014
Advanced Image and Video Retrieval Techniques | 25 references | 56 citations
TL;DR

This paper evaluates ten object proposal methods for object detection using ground truth recall, repeatability, and impact on DPM detector performance across Pascal VOC 2007 and ImageNet 2013. It finds that Selective Search and EdgeBoxes offer the best balance of recall, repeatability, and speed, while most methods suffer from low repeatability due to unstable superpixels, and that proposal quality significantly affects final detection accuracy.

ABSTRACT

Current top performing Pascal VOC object detectors employ detection proposals to guide the search for objects thereby avoiding exhaustive sliding window search across images. Despite the popularity of detection proposals, it is unclear which trade-offs are made when using them during object detection. We provide an in depth analysis of ten object proposal methods along with four baselines regarding ground truth annotation recall (on Pascal VOC 2007 and ImageNet 2013), repeatability, and impact on DPM detector performance. Our findings show common weaknesses of existing methods, and provide insights to choose the most adequate method for different settings.

Motivation & Objective

  • To provide a systematic, unbiased comparison of existing detection proposal methods in a unified evaluation framework.
  • To analyze the trade-offs between proposal quality, speed, and repeatability in object detection pipelines.
  • To assess generalization beyond Pascal VOC by evaluating on the larger, more diverse ImageNet 2013 validation set.
  • To quantify the impact of proposal methods on final detector performance using DPM as a baseline.
  • To release all bounding boxes and evaluation scripts to support reproducibility and future method comparisons.

Proposed method

  • The authors evaluate ten publicly available detection proposal methods, including Selective Search, EdgeBoxes, MCG, and CPMC, using a common evaluation pipeline.
  • Ground truth recall is measured on Pascal VOC 2007 and ImageNet 2013 using IoU thresholds (e.g., ≥0.5, ≥0.7) to assess localization accuracy.
  • Repeatability is introduced as a new metric, measuring consistency of proposals under image perturbations such as noise and blurring.
  • A DPM detector is applied to each proposal window, followed by non-maximum suppression and bounding box regression to evaluate final detection performance.
  • The evaluation uses 1,000 proposals per image and compares mAP and per-class detection metrics across methods.
  • The experiments required more than 2.5 months of CPU computation; the results and code are publicly released.
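
The ground truth recall measurement described in the list above can be sketched as follows. This is a minimal per-image illustration, not the authors' released evaluation code: the corner box format `(x1, y1, x2, y2)`, the function names `iou` and `recall_at_iou`, and the toy boxes are all assumptions made for the example.

```python
from typing import Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union (IoU) of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(gt_boxes: Sequence[Box],
                  proposals: Sequence[Box],
                  threshold: float) -> float:
    """Fraction of ground truth boxes matched by at least one proposal
    whose IoU with the box is at or above the threshold."""
    if not gt_boxes:
        return 0.0
    covered = sum(
        1 for gt in gt_boxes
        if any(iou(gt, p) >= threshold for p in proposals)
    )
    return covered / len(gt_boxes)


# Toy data (hypothetical): two annotated objects and three proposals.
gt = [(0, 0, 10, 10), (20, 20, 40, 40)]
props = [(0, 0, 10, 6), (22, 22, 40, 40), (50, 50, 60, 60)]

for t in (0.5, 0.7):
    print(f"recall at IoU >= {t}: {recall_at_iou(gt, props, t)}")
```

In this toy case the first object is only loosely covered, so it counts as recalled at the looser threshold but not at the stricter one. That is why reporting recall at several IoU thresholds says more about localization accuracy than a single number at 0.5.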

Experimental results

Research questions

  • RQ1: How do different detection proposal methods compare in terms of ground truth recall on Pascal VOC 2007 and ImageNet 2013?
  • RQ2: What is the repeatability of detection proposals under image perturbations, and how does it affect detection reliability?
  • RQ3: How do proposal methods influence the final mAP of a DPM detector, and which methods yield the best detection performance?
  • RQ4: Do proposal methods generalize well beyond Pascal VOC, or are they biased toward specific object categories?
  • RQ5: Which method offers the best trade-off between speed, recall, and detection quality in practice?

Key findings

  • Selective Search and EdgeBoxes offer the best balance of ground truth recall (above 69% AUC), repeatability, and speed, making them the strongest overall choices.
  • MCG achieves the highest recall (over 70% at IoU ≥0.5) when using 1,000 proposals, but is slower than EdgeBoxes and Selective Search.
  • EdgeBoxes provides the best compromise between speed and quality when fewer than 1,000 proposals are needed.
  • Objectness and Bing have lower mAP than Rahtu and the Gaussian baseline despite similar AUC, because their proposals localize objects poorly at high IoU thresholds.
  • Most methods suffer from low repeatability due to instability in superpixel or boundary estimation, even under minor image perturbations.
  • ImageNet 2013 evaluation confirms that most methods generalize well beyond Pascal VOC, supporting their use as true 'objectness' methods.
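
The AUC figures quoted in the findings above summarize recall across many IoU thresholds in one number. The sketch below shows one way such a summary can be computed. It assumes the AUC is the area under the recall-versus-IoU-threshold curve over the range 0.5 to 1.0, normalized by the width of that range; that range, the trapezoidal integration, and the names `recall_curve` and `normalized_auc` are assumptions for illustration, not details confirmed by this review.

```python
from typing import List, Sequence


def recall_curve(best_ious: Sequence[float],
                 thresholds: Sequence[float]) -> List[float]:
    """Recall at each threshold, given the best proposal IoU per ground truth box."""
    n = len(best_ious)
    if n == 0:
        return [0.0 for _ in thresholds]
    return [sum(1 for v in best_ious if v >= t) / n for t in thresholds]


def normalized_auc(best_ious: Sequence[float],
                   lo: float = 0.5,
                   hi: float = 1.0,
                   steps: int = 50) -> float:
    """Trapezoidal area under the recall-versus-IoU curve on [lo, hi],
    divided by (hi - lo) so that a perfect method scores 1.0."""
    thresholds = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    recalls = recall_curve(best_ious, thresholds)
    area = sum(
        (recalls[i] + recalls[i + 1]) / 2 * (thresholds[i + 1] - thresholds[i])
        for i in range(steps)
    )
    return area / (hi - lo)


# Hypothetical best-IoU values for the ground truth boxes of two methods.
loose_method = [0.55, 0.60, 0.58, 0.62]   # covers every object, but loosely
tight_method = [0.90, 0.85, 0.40, 0.92]   # misses one object, localizes the rest well

print("loose:", round(normalized_auc(loose_method), 3))
print("tight:", round(normalized_auc(tight_method), 3))
```

Under these assumptions, the loosely localizing method has the higher recall at IoU ≥ 0.5 but the lower AUC. The toy example also shows why AUC cannot tell the whole story: the same area can come from differently shaped curves, so recall at high IoU thresholds is still worth inspecting on its own.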


This review was created by AI and reviewed by human editors.