CenterNet: Keypoint Triplets for Object Detection (1904.08189v3)

Published 17 Apr 2019 in cs.CV

Abstract: In object detection, keypoint-based approaches often suffer a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions. This paper presents an efficient solution which explores the visual patterns within each cropped region with minimal costs. We build our framework upon a representative one-stage keypoint-based detector named CornerNet. Our approach, named CenterNet, detects each object as a triplet, rather than a pair, of keypoints, which improves both precision and recall. Accordingly, we design two customized modules named cascade corner pooling and center pooling, which play the roles of enriching information collected by both top-left and bottom-right corners and providing more recognizable information at the central regions, respectively. On the MS-COCO dataset, CenterNet achieves an AP of 47.0%, which outperforms all existing one-stage detectors by at least 4.9%. Meanwhile, with a faster inference speed, CenterNet demonstrates quite comparable performance to the top-ranked two-stage detectors. Code is available at https://github.com/Duankaiwen/CenterNet.

Citations (2,434)

Summary

The paper introduces a novel center keypoint, forming triplets that validate bounding boxes and enhance detection accuracy.
The authors employ specialized pooling modules, including center pooling and cascade corner pooling, to improve feature extraction with minimal overhead.
On MS-COCO, CenterNet reaches an AP of 47.0%, outperforming existing one-stage detectors by at least 4.9% and producing markedly fewer incorrect bounding boxes.

An Analytical Overview of "CenterNet: Keypoint Triplets for Object Detection"

The paper "CenterNet: Keypoint Triplets for Object Detection" by Kaiwen Duan et al. extends the CornerNet framework with a third keypoint at the center of each object. It targets a known weakness of one-stage keypoint-based detectors: their tendency to generate a large number of incorrect bounding boxes.

Methodology and Contributions

CornerNet represents each object with a pair of corner keypoints. CenterNet adds a third keypoint at the object's center, and this triplet representation improves both precision and recall. The authors strengthen the network's discriminative capacity at minimal computational cost through two additions:

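Center Keypoint Detection: In addition to the two corners, the network predicts a center keypoint. A bounding box proposed from a corner pair is kept only when a center keypoint of the same class falls inside the box's central region, which lets the network reject boxes that do not actually enclose an object.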
Customized Pooling Modules:
- Center Pooling: For each location, this module takes the maximum response along its horizontal direction and along its vertical direction and adds the two, which makes the visual patterns at an object's center easier to recognize.
- Cascade Corner Pooling: This module extends the original corner pooling, which looks only along object boundaries, so that corners also collect information from inside the object. The added internal information makes corner features more stable in the presence of noise.
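
As a rough illustration of center pooling, the NumPy sketch below computes, for every location of a single-channel feature map, the maximum of its row plus the maximum of its column. The function name `center_pooling` and the plain-array formulation are assumptions made for illustration; the authors' released code builds pooling into the network as layers and may differ in detail.

```python
import numpy as np


def center_pooling(feature_map: np.ndarray) -> np.ndarray:
    """Illustrative center pooling on a 2-D feature map of shape (H, W).

    out[i, j] = max over row i + max over column j.
    This is a simplified sketch, not the authors' implementation.
    """
    if feature_map.ndim != 2:
        raise ValueError("expected a 2-D array of shape (H, W)")
    # Strongest response in each row, shape (H, 1).
    row_max = feature_map.max(axis=1, keepdims=True)
    # Strongest response in each column, shape (1, W).
    col_max = feature_map.max(axis=0, keepdims=True)
    # Broadcasting adds the two maxima at every location, shape (H, W).
    return row_max + col_max
```

A location therefore scores highly whenever strong responses exist anywhere along its row and its column, which is why a center point can pick up evidence from parts of the object far from the geometric center.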

Empirical Evaluation

CenterNet is evaluated on the MS-COCO dataset, a standard benchmark for large-scale object detection. The main results are:

CenterNet achieves an Average Precision (AP) of 47.0% on the MS-COCO test-dev set, outperforming existing one-stage detectors by at least 4.9%.
With an Hourglass-52 backbone, CenterNet reaches a single-scale testing AP of 41.6% and a multi-scale testing AP of 43.5%. With the deeper Hourglass-104 backbone, it reaches 44.9% single-scale and 47.0% multi-scale testing AP, which is competitive with leading two-stage detectors.
The paper reports that CenterNet substantially reduces the false discovery rate, particularly for small objects, and attributes this gain in precision to the center keypoints.
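
The center-keypoint check behind this reduction in false discoveries can be sketched as follows. This is a minimal illustration, not the authors' code: the `Box` type, the function names, and the representation of centers as `(x, y, label)` tuples are hypothetical. The central-region rule (the middle 1/n of the box along each axis) and the values n = 3 for small boxes, n = 5 for large boxes, and a 150-pixel threshold are assumptions recalled from the paper, as is the use of the longer side as the box "scale"; all should be checked against the paper before reuse.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    tl_x: float  # top-left corner
    tl_y: float
    br_x: float  # bottom-right corner
    br_y: float
    label: int   # predicted class


def central_region(box: Box, n: int) -> Tuple[float, float, float, float]:
    """Return (x0, y0, x1, y1) covering the middle 1/n of the box per axis."""
    x0 = ((n + 1) * box.tl_x + (n - 1) * box.br_x) / (2 * n)
    y0 = ((n + 1) * box.tl_y + (n - 1) * box.br_y) / (2 * n)
    x1 = ((n - 1) * box.tl_x + (n + 1) * box.br_x) / (2 * n)
    y1 = ((n - 1) * box.tl_y + (n + 1) * box.br_y) / (2 * n)
    return x0, y0, x1, y1


def choose_n(box: Box, small_n: int = 3, large_n: int = 5,
             scale_threshold: float = 150.0) -> int:
    """Assumed rule: larger boxes get a relatively smaller central region."""
    scale = max(box.br_x - box.tl_x, box.br_y - box.tl_y)
    return small_n if scale < scale_threshold else large_n


def filter_boxes(boxes: List[Box],
                 centers: List[Tuple[float, float, int]]) -> List[Box]:
    """Keep boxes whose central region contains a same-class center keypoint."""
    kept = []
    for box in boxes:
        x0, y0, x1, y1 = central_region(box, choose_n(box))
        for cx, cy, label in centers:
            if label == box.label and x0 <= cx <= x1 and y0 <= cy <= y1:
                kept.append(box)
                break  # one supporting center is enough
    return kept
```

Under these assumptions, a box spanning (0, 0) to (60, 60) has a central region from (20, 20) to (40, 40), so a same-class center at (30, 30) keeps the box while one at (5, 5) does not.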

Theoretical and Practical Implications

Theoretically, CenterNet changes how one-stage detectors represent an object: a keypoint triplet combines boundary information from the corners with information from the object's central region. This gives a one-stage detector part of the discriminative power that two-stage methods obtain from inspecting cropped regions, while retaining the efficiency of a one-stage framework.

Practically, the center-keypoint check could be applied to other one-stage keypoint-based approaches, which is relevant to detection tasks that demand both computational efficiency and high precision. The idea could also be combined with newer architectures and training strategies to further improve detection accuracy and robustness.

Future Developments

Future work could explore more sophisticated pooling functions or training regimes tailored to the triplet-based detection strategy. Integration with other deep learning advances, such as Transformer models for visual tasks, is another possible direction for broadening CenterNet's applicability and performance.

In conclusion, CenterNet builds directly on CornerNet. By representing each object as a keypoint triplet, it achieves higher AP and fewer incorrect detections at minimal additional computational cost, making it a strong entry among one-stage object detection frameworks.

Related Papers

Objects as Points (2019)
CentripetalNet: Pursuing High-quality Keypoint Pairs for Object Detection (2020)
CenterNet3D: An Anchor Free Object Detector for Point Cloud (2020)
CornerNet: Detecting Objects as Paired Keypoints (2018)
CenterNet++ for Object Detection (2022)
