How good are detection proposals, really? (1406.6962v2)

Published 26 Jun 2014 in cs.CV

Abstract: Current top performing Pascal VOC object detectors employ detection proposals to guide the search for objects thereby avoiding exhaustive sliding window search across images. Despite the popularity of detection proposals, it is unclear which trade-offs are made when using them during object detection. We provide an in depth analysis of ten object proposal methods along with four baselines regarding ground truth annotation recall (on Pascal VOC 2007 and ImageNet 2013), repeatability, and impact on DPM detector performance. Our findings show common weaknesses of existing methods, and provide insights to choose the most adequate method for different settings.

Summary

The paper's main contribution is a unified evaluation framework for detection proposals, built around ground truth recall, repeatability, and impact on detector performance.
It shows that methods such as MCG, EdgeBoxes, and Selective Search achieve the highest recall, that Bing and EdgeBoxes are the most repeatable, and that many proposals are needed to reach good recall at high overlap thresholds.
The study finds that using detection proposals can reduce DPM performance relative to sliding window search, so the proposal method must be chosen with care.

An Analysis and Evaluation of Object Detection Proposal Methods

The paper "How good are detection proposals, really?" by Jan Hosang, Rodrigo Benenson, and Bernt Schiele provides a thorough assessment of several object detection proposal methods, focusing on their recall, repeatability, and impact on detection performance. Object detection proposals have become an essential component in modern computer vision tasks, particularly as a strategy to avoid exhaustive sliding window searches, which can be computationally expensive. This paper critically examines ten existing proposal methods with a rigorous evaluation framework on widely used datasets such as Pascal VOC 2007 and ImageNet 2013.

Evaluation Framework and Contributions

The authors make four contributions: a unified analysis of detection proposal methods, the introduction of repeatability as an evaluation metric, a study of how well proposals overlap with ground truth annotations, and an assessment of how proposals affect Deformable Part Model (DPM) detectors. Together these lay the groundwork for understanding the trade-offs between proposal methods.

Detection Proposal Methods

The evaluation encompasses a variety of proposal generation techniques, notably gPbUCM, Objectness, CPMC, Endres2010, Selective Search, Randomized Prim’s, Bing, MCG, Rantalankila2014, and EdgeBoxes. These methods range from those utilizing low-level image features to others employing advanced segmentation strategies and machine learning techniques. Interestingly, the majority of these methods rely on segmentation processes, particularly the watershed of low-level superpixels.

Insights on Repeatability

Repeatability measures how consistent detection proposals remain when an image undergoes slight perturbations such as blurring, scaling, rotation, illumination changes, and JPEG compression. The authors show that superpixel-based methods suffer significant drops in repeatability under even minor perturbations, a finding that calls their robustness in practical settings into question. Bing and EdgeBoxes exhibit the best repeatability, possibly owing to their machine learning components and lower reliance on segmentation.
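To make the idea of repeatability concrete, the sketch below counts how many proposals from a reference image find a counterpart among the proposals from a perturbed copy of the same image. It is a simplified illustration, not necessarily the paper's exact protocol: the function names, the (x1, y1, x2, y2) box format, the greedy one-to-one matching, and the assumption that perturbed proposals have already been mapped back into the reference image's coordinates are all choices made here for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def matched_fraction(reference: List[Box], perturbed: List[Box],
                     threshold: float = 0.5) -> float:
    """Fraction of reference proposals matched one-to-one to a perturbed
    proposal with IoU >= threshold (greedy, highest overlap first)."""
    if not reference:
        return 0.0
    # All candidate pairs, sorted so the best overlaps are matched first.
    pairs = sorted(
        ((box_iou(r, p), i, j)
         for i, r in enumerate(reference)
         for j, p in enumerate(perturbed)),
        reverse=True,
    )
    used_ref, used_pert = set(), set()
    for overlap, i, j in pairs:
        if overlap < threshold:
            break  # remaining pairs overlap even less
        if i in used_ref or j in used_pert:
            continue  # each proposal may be matched at most once
        used_ref.add(i)
        used_pert.add(j)
    return len(used_ref) / len(reference)


reference = [(0, 0, 10, 10), (20, 20, 30, 30)]
perturbed = [(1, 1, 11, 11), (40, 40, 50, 50)]
print(matched_fraction(reference, perturbed, threshold=0.5))
```

A method whose proposals shift or vanish under small perturbations yields a low matched fraction; sweeping the threshold shows how quickly the agreement decays as stricter overlap is required.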

Ground Truth Recall and Detection Performance

The analysis also covers ground truth recall, where MCG, EdgeBoxes, and Selective Search consistently achieve higher recall on both Pascal VOC and ImageNet. The authors highlight EdgeBoxes as a strong performer, particularly when tuned for a specific overlap threshold, offering a good balance of speed and localization quality. Importantly, a large number of proposals is needed to reach acceptable recall, especially as the required overlap threshold increases.
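As a minimal sketch of the recall measure discussed above, the snippet below counts a ground truth box as recalled when at least one proposal overlaps it with intersection over union (IoU) at or above a chosen threshold. The function names, the (x1, y1, x2, y2) box format, and the example boxes are assumptions made for illustration and are not taken from the paper's code.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def recall_at_iou(ground_truth: List[Box], proposals: List[Box],
                  threshold: float = 0.5) -> float:
    """Fraction of ground truth boxes covered by at least one proposal
    with IoU >= threshold."""
    if not ground_truth:
        return 0.0
    covered = sum(
        1 for g in ground_truth
        if any(iou(g, p) >= threshold for p in proposals)
    )
    return covered / len(ground_truth)


ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
proposals = [(0, 0, 10, 8), (50, 50, 60, 60)]
for t in (0.5, 0.7, 0.9):
    print(t, recall_at_iou(ground_truth, proposals, threshold=t))
```

Raising the threshold can only lower the recall for a fixed set of proposals, which is why more (or better localized) proposals are needed at stricter overlap criteria.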

The paper also investigates the impact of proposal methods on DPM object detectors. It observes that using detection proposals generally degrades detection performance compared to the traditional sliding window approach, because the proposals shift the distribution of negative samples the detector sees. Methods with better localization (e.g., MCG and EdgeBoxes) incur smaller drops in performance.

Implications and Future Work

The results suggest that while detection proposals can benefit object detection by reducing false positives and computational load, choosing the right proposal method is crucial and task-dependent. The authors point to improving superpixel stability as a way to raise repeatability. They also note that the methods transfer across datasets: performance on ImageNet is nearly equivalent to that on Pascal VOC despite the greater category diversity.

In conclusion, the paper provides foundational insights and a comprehensive evaluation that can guide researchers and practitioners in choosing or developing detection proposal methods. Future work might focus on making proposals more robust, improving the repeatability of segmentation-based methods in particular, and investigating alternative approaches, given that current detection proposals may turn out to be a "transition technology".