Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review (2005.09830v1)

Published 20 May 2020 in cs.CV

Abstract: Recently, the advancement of deep learning in discriminative feature learning from 3D LiDAR data has led to rapid development in the field of autonomous driving. However, automated processing of uneven, unstructured, noisy, and massive 3D point clouds is a challenging and tedious task. In this paper, we provide a systematic review of existing compelling deep learning architectures applied to LiDAR point clouds, detailing specific tasks in autonomous driving such as segmentation, detection, and classification. Although several published research papers focus on specific topics in computer vision for autonomous vehicles, to date, no general survey on deep learning applied to LiDAR point clouds for autonomous vehicles exists. Thus, the goal of this paper is to narrow the gap in this topic. More than 140 key contributions from the past five years are summarized in this survey, including the milestone 3D deep architectures; the remarkable deep learning applications in 3D semantic segmentation, object detection, and classification; and specific datasets, evaluation metrics, and state-of-the-art performance. Finally, we summarize the remaining challenges and future research directions.

Authors (7)

Ying Li (432 papers)
Lingfei Ma (7 papers)
Zilong Zhong (4 papers)
Fei Liu (232 papers)
Dongpu Cao (26 papers)
Jonathan Li (62 papers)
Michael A. Chapman (2 papers)

Citations (342)


Summary


The paper "Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review" collates advancements in the application of deep learning (DL) techniques to LiDAR point clouds for autonomous driving. This systematic review targets segmentation, detection, and classification tasks, essential for the perception systems of autonomous vehicles (AVs). It underscores the inherent challenges of processing uneven, unstructured, and voluminous 3D point clouds, proposing DL as an effective solution to these complexities.

Key Contributions and DL Architectures

The authors survey more than 140 significant contributions from the five years preceding the paper's 2020 publication, emphasizing milestone 3D DL architectures for point cloud processing. These include voxel-based models, point-based models such as PointNet and its extension PointNet++, graph-based models like DGCNN, and multiview-based models such as MVCNN. Each of these architectures addresses critical challenges, from permutation and orientation invariance to efficient computation over large data volumes. However, the paper notes that these models often fail to capture the intricate geometric relationships among point features, which leaves room for innovation in this area.
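To make the permutation-invariance property concrete, the following is a minimal NumPy sketch of the idea commonly associated with PointNet-style models: apply the same small transform to every point, then aggregate with a symmetric function (here, a max over points). It is an illustration under stated assumptions, not the architecture as published; the function name `global_feature`, the single linear layer, and the random weights are all hypothetical choices made for brevity.

```python
import numpy as np

def global_feature(points, weights, bias):
    """Shared per-point linear layer + ReLU, followed by max pooling over points.

    points:  (N, 3) array of xyz coordinates
    weights: (3, C) array, shared by every point (hypothetical, untrained)
    bias:    (C,) array
    Returns a (C,) global descriptor that does not depend on point order.
    """
    per_point = np.maximum(points @ weights + bias, 0.0)  # (N, C)
    return per_point.max(axis=0)  # symmetric aggregation over the N points

rng = np.random.default_rng(0)
points = rng.normal(size=(128, 3))
weights = rng.normal(size=(3, 16))
bias = rng.normal(size=16)

# Reorder the same points; the pooled descriptor should be unchanged.
shuffled = points[rng.permutation(len(points))]
feat_a = global_feature(points, weights, bias)
feat_b = global_feature(shuffled, weights, bias)
print(np.allclose(feat_a, feat_b))
```

Because the max is taken across the point axis, any reordering of the input rows yields the same descriptor, which is why a symmetric pooling step suits unordered point sets.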

Dataset Surveys and Evaluation Metrics

The authors also review existing datasets used in training DL models, such as Semantic3D and KITTI, noting their roles in advancing segmentation and detection accuracy. They discuss evaluation metrics for point cloud segmentation, detection, and classification, with IoU and AP being prominent. The paper highlights the data sparsity challenge in LiDAR datasets, emphasizing the need for more data-efficient models capable of performing well even with sparse representations.
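As a rough illustration of the IoU metric mentioned above, per-class IoU for point-wise segmentation can be computed as the number of points where prediction and ground truth agree on a class, divided by the number of points where either assigns that class. The sketch below assumes integer class labels per point; the function name `per_class_iou` and the toy labels are hypothetical and are not taken from the paper or from any benchmark toolkit.

```python
import numpy as np

def per_class_iou(pred, target, num_classes):
    """Return an array of IoU values, one per class (NaN if the class is absent)."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            ious.append(float("nan"))  # class appears in neither array
        else:
            inter = np.logical_and(pred_c, target_c).sum()
            ious.append(inter / union)
    return np.array(ious)

# Toy example: six points, three classes (made-up labels).
pred = np.array([0, 0, 1, 1, 2, 2])
target = np.array([0, 1, 1, 1, 2, 0])

ious = per_class_iou(pred, target, num_classes=3)
miou = np.nanmean(ious)  # mean IoU over the classes that are present
print(ious, miou)
```

Averaging with `np.nanmean` skips classes that occur in neither the prediction nor the ground truth; whether absent classes should be skipped or counted is a convention that differs between benchmarks.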

Practical and Theoretical Implications

Practical implications of the reviewed technologies point towards their integration into AV systems, enhancing real-time navigation, scene understanding, and decision-making capabilities. Theoretically, the paper posits future advancements in unified models capable of simultaneously handling multiple perception tasks like segmentation, detection, and classification with greater accuracy and efficiency. Furthermore, it contemplates unsupervised or weakly supervised learning paradigms to offset the scarcity of labeled training data.

Challenges and Future Directions

While the paper acknowledges the strides made by DL in handling 3D point clouds, it underlines several ongoing challenges:

Data Representation: Finding robust, efficient data representations that maximize accuracy without incurring prohibitive computational costs remains an open question.
End-to-End Learning: Many current systems do not support end-to-end training across multi-source data fusion, which is critical for robust AV environments.
Model Efficiency: Lightweight models that deliver real-time performance on AV platforms with limited computational resources are needed.
Contextual Understanding: Current DL models often lack the ability to extract and leverage contextual information from sparse point clouds.
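
The data representation trade-off listed above can be illustrated with a simple voxel grid, one of the representation families the review covers. The sketch below bins points into cubic voxels and keeps one centroid per occupied voxel; a larger `voxel_size` reduces the data volume at the cost of geometric detail. The function name `voxelize` and the sample points are hypothetical, and practical pipelines typically add range cropping and per-voxel point limits that are omitted here.

```python
import numpy as np

def voxelize(points, voxel_size):
    """Group points into cubic voxels and return one centroid per occupied voxel.

    points:     (N, 3) array of xyz coordinates
    voxel_size: edge length of each voxel, in the same units as points
    Returns (voxel_indices, centroids, counts).
    """
    idx = np.floor(points / voxel_size).astype(np.int64)  # integer voxel coordinates
    voxels, inverse, counts = np.unique(
        idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # keep a flat index across NumPy versions
    sums = np.zeros((len(voxels), 3))
    np.add.at(sums, inverse, points)  # accumulate the points that fall in each voxel
    centroids = sums / counts[:, None]
    return voxels, centroids, counts

points = np.array([
    [0.1, 0.1, 0.1],
    [0.2, 0.3, 0.4],
    [1.5, 0.2, 0.1],
    [-0.2, 0.1, 0.1],
])
voxels, centroids, counts = voxelize(points, voxel_size=1.0)
print(voxels)
print(centroids)
print(counts)
```

Only occupied voxels are stored, which matters for LiDAR scans where most of the 3D volume is empty.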

Conclusions

In summary, this paper systematically categorizes DL frameworks and applications tailored to LiDAR point clouds and recommends areas for future exploration. Improvements in deep learning architectures, coupled with new data handling and fusion strategies, are expected to advance the field of autonomous driving. Integrating LiDAR-based perception with complementary sensor data will be central to achieving reliable and intelligent vehicle automation.
