CenterNet3D: An Anchor Free Object Detector for Point Cloud (2007.07214v4)

Published 13 Jul 2020 in cs.CV

Abstract: Accurate and fast 3D object detection from point clouds is a key task in autonomous driving. Existing one-stage 3D object detection methods can achieve real-time performance; however, they are dominated by anchor-based detectors, which are inefficient and require additional post-processing. In this paper, we eliminate anchors and model an object as a single point: the center point of its bounding box. Based on the center point, we propose an anchor-free CenterNet3D network that performs 3D object detection without anchors. Our CenterNet3D uses keypoint estimation to find center points and directly regresses 3D bounding boxes. However, because of the inherent sparsity of point clouds, 3D object center points are likely to lie in empty space, which makes it difficult to estimate accurate boundaries. To solve this issue, we propose an extra corner attention module that forces the CNN backbone to pay more attention to object boundaries. Besides, considering that one-stage detectors suffer from a discordance between the predicted bounding boxes and the corresponding classification confidences, we develop an efficient keypoint-sensitive warping operation to align the confidences with the predicted bounding boxes. Our proposed CenterNet3D is non-maximum-suppression free, which makes it simpler and more efficient. We evaluate CenterNet3D on the widely used KITTI dataset and the more challenging nuScenes dataset. Our method outperforms all state-of-the-art anchor-based one-stage methods and has performance comparable to two-stage methods as well. It has an inference speed of 20 FPS and achieves the best speed-accuracy trade-off. Our source code will be released at https://github.com/wangguojun2018/CenterNet3d.

Authors (6)
  1. Guojun Wang (8 papers)
  2. Jian Wu (314 papers)
  3. Bin Tian (11 papers)
  4. Siyu Teng (8 papers)
  5. Long Chen (395 papers)
  6. Dongpu Cao (26 papers)
Citations (27)

Summary

An Analysis of CenterNet3D: An Anchor-Free Object Detector for Point Cloud

The paper "CenterNet3D: An Anchor-Free Object Detector for Point Cloud" presents a substantial advancement in the domain of 3D object detection, particularly in the context of point clouds for applications in autonomous driving. The authors propose CenterNet3D, a novel approach that eschews traditional anchor-based detection mechanisms in favor of a more efficient and streamlined anchor-free method. This paper compares CenterNet3D with existing 3D object detection methodologies, illustrating its competitive performance and improved computational efficiency.

Methodology and Innovations

CenterNet3D models 3D objects in point clouds by representing each object as the center point of its bounding box, eliminating the dependency on predefined anchors and complicated post-processing operations such as non-maximum suppression (NMS). Instead, the model uses keypoint estimation techniques to identify center points and directly predicts 3D bounding boxes. Notably, the paper introduces a corner attention module to address the sparsity issues inherent in point clouds, where 3D object center points may reside in empty spaces. This module enhances the CNN backbone's ability to recognize object boundaries, thereby improving the model's capability to predict accurate boundaries for detected objects.
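
To make the NMS-free decoding concrete, the sketch below shows CenterNet-style peak extraction on a bird's-eye-view heatmap, where a simple max-pooling comparison keeps only local maxima in place of box-level non-maximum suppression. This is an illustrative PyTorch sketch: the function name `decode_centers` and the tensor layout are assumptions made for exposition, not the authors' released API.

```python
# Minimal sketch of CenterNet-style, NMS-free center decoding on a BEV heatmap.
# Assumes PyTorch; names and tensor shapes are illustrative, not the authors' code.
import torch
import torch.nn.functional as F

def decode_centers(heatmap: torch.Tensor, top_k: int = 100):
    """Pick object centers as local maxima of per-class heatmaps.

    heatmap: (B, C, H, W) sigmoid scores per class on the BEV grid.
    Returns per-batch scores, class ids, and (x, y) grid coordinates.
    """
    batch, num_cls, height, width = heatmap.shape

    # A cell is kept only if it equals the max of its 3x3 neighbourhood;
    # this local-maximum test replaces NMS over overlapping boxes.
    local_max = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    peaks = heatmap * (heatmap == local_max).float()

    # Take the top-k peaks over all classes and positions.
    scores, indices = peaks.view(batch, -1).topk(top_k)
    class_ids = indices // (height * width)
    ys = (indices % (height * width)) // width
    xs = indices % width
    return scores, class_ids, xs, ys
```

Because overlapping boxes never have to compete after this step, the decoder stays NMS-free, which is where much of the reported efficiency gain comes from; the box parameters themselves would be read off the regression heads at the returned (x, y) cells.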

Crucial to the reliability of one-stage detectors is the alignment of predicted bounding boxes with classification confidences. CenterNet3D addresses this by implementing a keypoint-sensitive warping (KSWarp) operation, aligning classification confidences with localization boundaries without requiring an additional network stage. This operation enhances the consistency between object localization and classification confidence.
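
The keypoint-sensitive warping idea can be approximated as sampling backbone features at keypoints of each predicted box and re-scoring the box from those features. The following PyTorch sketch uses bilinear grid sampling over a BEV feature map; the function `keypoint_aligned_scores`, the choice of keypoints, and the small scoring head are illustrative assumptions rather than the paper's exact operator.

```python
# Simplified sketch of keypoint-sensitive confidence alignment: sample BEV
# features at keypoints of each predicted box (e.g., the center plus the four
# BEV corners) with bilinear interpolation, then re-score the box from the
# pooled keypoint features. Illustrative only; not the paper's exact operator.
import torch
import torch.nn.functional as F

def keypoint_aligned_scores(bev_feat: torch.Tensor,
                            keypoints: torch.Tensor,
                            score_head: torch.nn.Module) -> torch.Tensor:
    """bev_feat:   (1, C, H, W) backbone feature map.
    keypoints:  (N, K, 2) keypoint (x, y) locations in grid coordinates
                for N predicted boxes with K keypoints each.
    score_head: small module mapping pooled C-dim features to one logit.
    """
    _, channels, height, width = bev_feat.shape

    # Normalize grid coordinates to [-1, 1] as required by grid_sample.
    norm = keypoints.clone().float()
    norm[..., 0] = keypoints[..., 0] / (width - 1) * 2 - 1
    norm[..., 1] = keypoints[..., 1] / (height - 1) * 2 - 1

    # grid_sample expects a (B, H_out, W_out, 2) grid; treat boxes as rows.
    grid = norm.unsqueeze(0)                                       # (1, N, K, 2)
    sampled = F.grid_sample(bev_feat, grid,
                            mode='bilinear', align_corners=True)   # (1, C, N, K)

    # Pool the K keypoint features per box and predict an aligned confidence.
    pooled = sampled.mean(dim=-1).squeeze(0).t()                   # (N, C)
    return torch.sigmoid(score_head(pooled)).squeeze(-1)           # (N,)
```

In practice `score_head` could be as small as a single linear layer (e.g., `torch.nn.Linear(channels, 1)`), and the aligned score would replace the raw heatmap score when ranking detections.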

Practical Evaluations and Implications

The paper presents a comprehensive evaluation of CenterNet3D on the KITTI and nuScenes datasets, both standard benchmarks for 3D object detection in autonomous driving scenarios. The results indicate that CenterNet3D performs favorably against current state-of-the-art one-stage and two-stage methods, achieving a strong speed-accuracy trade-off at an inference speed of 20 FPS. The authors emphasize the effectiveness of CenterNet3D across varying difficulty levels and object categories in the KITTI dataset, as well as its ability to handle small and densely packed objects in the more challenging nuScenes dataset.

Importantly, the anchor-free architecture of CenterNet3D reduces the number of hyperparameters and the overall design complexity, avoiding the computational overhead associated with the dense anchor boxes used in other methods. As such, the model combines effectiveness, simplicity, and efficiency, making it well suited for real-world autonomous vehicle systems where computational resources are a critical concern.

Future Directions

The implications of such a streamlined approach are significant for the future development of autonomous systems. By adopting an anchor-free design, CenterNet3D lays the groundwork for further exploration of efficient object detection models that maintain high precision in complex driving environments. Future research could expand upon the corner attention mechanism and confidence alignment technique introduced by CenterNet3D, improving detection accuracy in edge cases caused by sparse data. Additionally, multi-sensor fusion that combines camera data with point clouds could mitigate false positives and further bolster system robustness.

In conclusion, CenterNet3D represents an incremental step towards efficient and accurate 3D object detection, which is essential for the progression of safe and reliable autonomous driving technologies.
