CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection (2011.04841v1)

Published 10 Nov 2020 in cs.CV

Abstract: The perception system in autonomous vehicles is responsible for detecting and tracking the surrounding objects. This is usually done by taking advantage of several sensing modalities to increase robustness and accuracy, which makes sensor fusion a crucial part of the perception system. In this paper, we focus on the problem of radar and camera sensor fusion and propose a middle-fusion approach to exploit both radar and camera data for 3D object detection. Our approach, called CenterFusion, first uses a center point detection network to detect objects by identifying their center points on the image. It then solves the key data association problem using a novel frustum-based method to associate the radar detections to their corresponding object's center point. The associated radar detections are used to generate radar-based feature maps to complement the image features, and regress to object properties such as depth, rotation and velocity. We evaluate CenterFusion on the challenging nuScenes dataset, where it improves the overall nuScenes Detection Score (NDS) of the state-of-the-art camera-based algorithm by more than 12%. We further show that CenterFusion significantly improves the velocity estimation accuracy without using any additional temporal information. The code is available at https://github.com/mrnabati/CenterFusion .

Citations (253)

Summary

  • The paper presents a novel sensor fusion method, CenterFusion, that integrates radar and camera data for enhanced 3D object detection.
  • The methodology leverages a center detection network and frustum association to combine radar and camera inputs, yielding a 12% NDS improvement on nuScenes.
  • Results show improved velocity estimation and depth accuracy, positioning CenterFusion as a promising solution for autonomous vehicle perception.

CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection

The paper entitled "CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection" addresses a critical challenge in the development of autonomous vehicle perception systems: sensor fusion. Specifically, it offers a novel approach to integrating radar and camera data for enhanced 3D object detection, termed CenterFusion. The authors, Ramin Nabati and Hairong Qi, present a detailed methodology and demonstrate the efficacy of their approach on the nuScenes dataset, highlighting significant improvements over existing camera-based methods.

Methodology

CenterFusion is motivated by the limitations of traditional LiDAR and camera fusion strategies. While LiDAR provides precise depth measurements, its performance degrades with range and in adverse weather. Radar, long used in automotive applications, offers robust long-range detection and direct velocity measurement, and is far less affected by environmental conditions.

The paper introduces a middle-fusion approach that leverages these strengths of radar. At the core of CenterFusion is a center point detection network, adapted from CenterNet, that identifies each object's center point in the image. The authors then propose a novel frustum association technique that matches radar detections to the identified center points.
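The center detection stage follows CenterNet's keypoint formulation, in which objects appear as peaks in per-class heatmaps. A minimal sketch of that peak extraction, assuming a PyTorch heatmap tensor and the standard max-pool trick as non-maximum suppression (the function name and top_k value are illustrative, not taken from the paper):

    import torch
    import torch.nn.functional as F

    def extract_centers(heatmap, top_k=100):
        """Pick object-center candidates as local peaks of a (num_classes, H, W)
        heatmap with values in [0, 1]."""
        # A 3x3 max-pool acts as cheap non-maximum suppression: a cell survives
        # only if it equals the local maximum of its neighborhood.
        pooled = F.max_pool2d(heatmap.unsqueeze(0), 3, stride=1, padding=1).squeeze(0)
        peaks = heatmap * (heatmap == pooled).float()

        # Keep the top_k highest-scoring peaks over all classes.
        scores, flat_idx = peaks.flatten().topk(top_k)
        num_classes, height, width = heatmap.shape
        cls = flat_idx // (height * width)
        ys = (flat_idx % (height * width)) // width
        xs = flat_idx % width
        return cls, ys, xs, scores

Each surviving peak is then paired with attributes regressed at that location, such as 2D size, a preliminary depth, and rotation.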

The methodology forms an RoI frustum for each detected object: a 3D region obtained by projecting the object's 2D bounding box into space around its predicted depth. Radar detections falling inside this frustum are associated with the object, and a pillar expansion step, which grows each radar point into a fixed-size vertical pillar, compensates for the poor height resolution of radar measurements.
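A simplified sketch of this association, assuming radar points already transformed into the camera frame; the pillar dimensions and the depth window controlled by delta are illustrative placeholders rather than the paper's exact parameters:

    import numpy as np

    def associate_radar(center_depth, box_2d, radar_points, calib,
                        pillar_size=(0.5, 0.5, 1.5), delta=0.25):
        """Return the index of the closest radar detection whose expanded
        pillar falls inside the object's RoI frustum, or None.

        radar_points : (N, 3) positions in the camera frame (x right, y down, z forward)
        calib        : 3x3 camera intrinsic matrix
        box_2d       : (x1, y1, x2, y2) detected bounding box in pixels
        center_depth : object depth predicted by the image-only head
        """
        dx, dy, dz = pillar_size
        x1, y1, x2, y2 = box_2d
        near, far = center_depth * (1 - delta), center_depth * (1 + delta)
        best, best_depth = None, np.inf

        for i, p in enumerate(radar_points):
            # Pillar expansion: grow the point into a small box so the missing
            # height measurement does not cause valid returns to be rejected.
            corners = p + np.array([[sx * dx, sy * dy, sz * dz]
                                    for sx in (-0.5, 0.5)
                                    for sy in (-0.5, 0.5)
                                    for sz in (-0.5, 0.5)])
            uv = (calib @ corners.T).T
            uv = uv[:, :2] / uv[:, 2:3]          # project corners to pixels
            in_box = ((uv[:, 0] >= x1) & (uv[:, 0] <= x2) &
                      (uv[:, 1] >= y1) & (uv[:, 1] <= y2))
            if in_box.any() and near <= p[2] <= far and p[2] < best_depth:
                best, best_depth = i, p[2]
        return best

When several pillars fall inside the frustum, the nearest one is kept, matching the intuition that the closest return most likely belongs to the detected object.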

The associated radar detections are then encoded as radar-based feature maps and concatenated with the image features, enabling refined estimation of the depth, orientation, and velocity of the detected objects.
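These radar channels carry the matched detection's depth and radial velocity components, painted into the region of the corresponding 2D box. A minimal sketch of that encoding, assuming NumPy arrays and omitting the value normalization the authors apply (function and key names are illustrative):

    import numpy as np

    def radar_feature_maps(image_feats, associations):
        """Concatenate three radar channels (depth, vx, vz) onto a (C, H, W)
        backbone feature map.

        associations: one dict per object that received a radar match, with
        keys 'box' (x1, y1, x2, y2 in feature-map coordinates), 'depth',
        'vx' and 'vz'.
        """
        _, h, w = image_feats.shape
        radar_feats = np.zeros((3, h, w), dtype=image_feats.dtype)
        for obj in associations:
            x1, y1, x2, y2 = [int(round(v)) for v in obj['box']]
            # Fill the object's box region so the secondary regression heads
            # can read the radar depth and velocity at the object's center.
            radar_feats[0, y1:y2, x1:x2] = obj['depth']
            radar_feats[1, y1:y2, x1:x2] = obj['vx']
            radar_feats[2, y1:y2, x1:x2] = obj['vz']
        return np.concatenate([image_feats, radar_feats], axis=0)

The concatenated features feed secondary regression heads that re-estimate depth, rotation, and velocity, which is where the reported velocity gains originate.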

Results and Evaluation

The robustness of CenterFusion is substantiated through rigorous evaluation on the nuScenes dataset, which is recognized for its complexity and variability in autonomous vehicle scenarios. The model achieves a notable enhancement of more than 12% in the nuScenes Detection Score (NDS) over leading camera-based methods. Furthermore, it demonstrates superior performance in velocity estimation, a critical factor for dynamic environment awareness, without resorting to temporal cues.
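For context, the NDS metric combines mean average precision with five true-positive error terms (translation, scale, orientation, velocity, and attribute errors), as defined by the nuScenes benchmark:

    \mathrm{NDS} = \frac{1}{10}\Big[\, 5\,\mathrm{mAP} + \sum_{\mathrm{mTP} \in \mathbb{TP}} \big(1 - \min(1, \mathrm{mTP})\big) \Big],
    \qquad \mathbb{TP} = \{\mathrm{mATE}, \mathrm{mASE}, \mathrm{mAOE}, \mathrm{mAVE}, \mathrm{mAAE}\}

so gains in velocity estimation (a lower mAVE) translate directly into a higher NDS.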

The reported per-category results corroborate these gains, showing improved accuracy across object classes while retaining computational efficiency.

Implications and Future Directions

CenterFusion sets a significant precedent for improving 3D object detection in autonomous systems through innovative sensor fusion. This paper not only illustrates the practical benefits of integrating radar data but also poses foundational questions for further exploration. Future directions might delve into optimizing the fusion model for real-time deployment, handling extreme edge cases, or integrating additional sensory modalities such as ultrasonic or infrared data.

The implications extend beyond autonomous driving: the principles demonstrated by CenterFusion could inform robotics, surveillance systems, and any domain requiring robust environmental perception. As radar sensors evolve toward denser and richer data acquisition, the fusion methodology outlined in this paper could help shape next-generation perception systems.

In conclusion, Nabati and Qi's work on CenterFusion demonstrates an innovative leap forward in sensor fusion technology, especially pertinent to autonomous vehicles. It successfully marries the accuracy and resilience of radar with the rich information capture of cameras, paving the way for advancements across multiple related fields.
