A review on deep learning techniques for 3D sensed data classification (1907.04444v1)

Published 9 Jul 2019 in cs.CV

Abstract: Over the past decade deep learning has driven progress in 2D image understanding. Despite these advancements, techniques for automatic understanding of 3D sensed data, such as point clouds, are comparatively immature. However, with a range of important applications from indoor robotics navigation to national scale remote sensing, there is a high demand for algorithms that can learn to automatically understand and classify 3D sensed data. In this paper we review the current state-of-the-art deep learning architectures for processing unstructured Euclidean data. We begin by addressing the background concepts and traditional methodologies. We review the current main approaches, including RGB-D, multi-view, volumetric and fully end-to-end architecture designs. Datasets for each category are documented and explained. Finally, we give a detailed discussion about the future of deep learning for 3D sensed data, using literature to justify the areas where future research would be most valuable.

Summary

Deep Learning Techniques for 3D Sensed Data Classification: An Expert Overview

The paper "A review on deep learning techniques for 3D sensed data classification" by Griffiths and Boehm surveys current methods for the classification and segmentation of three-dimensional (3D) sensed data. Motivated by the growing need for systems that can interpret complex 3D environments, it covers background concepts, documents the available datasets, and examines state-of-the-art techniques.

Overview of 3D Data Processing

The review emphasizes that 3D data interpretation differs fundamentally from 2D image tasks, largely because 3D data is unstructured. Point clouds, a primary form of 3D data, are unordered sets of points rather than regular grids, so existing methods require significant adaptation. This is particularly true in segmentation, where neighborhood relations are implicit rather than explicit. Robust solutions are therefore needed in applications ranging from autonomous navigation to augmented reality.
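To make the point about implicit neighborhoods concrete, the following is a minimal sketch, not taken from the paper. In an image, a pixel's neighbors follow from its row and column index; in a point cloud they have to be computed from the coordinates. The function name `k_nearest_neighbors` and the toy coordinates are illustrative assumptions, and the brute-force search is chosen for clarity rather than efficiency.

```python
import numpy as np

def k_nearest_neighbors(points, k):
    """Return an (N, k) array holding, for each point, the indices of its
    k nearest neighbours, found by brute-force distance computation."""
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # A point is not its own neighbour.
    np.fill_diagonal(dist2, np.inf)
    # Sort each row by distance and keep the k closest indices.
    return np.argsort(dist2, axis=1)[:, :k]

# Toy cloud (hypothetical values): row order carries no spatial meaning.
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [5.0, 5.0, 5.0]])
neighbors = k_nearest_neighbors(points, k=2)
print(neighbors)
```

The full distance matrix costs O(N^2) memory, so for large clouds a spatial index such as a k-d tree would be the more usual choice.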

Deep Learning Architectures for 3D Understanding

The review identifies several deep learning frameworks for 3D data understanding. Current methods fall into four main approaches: RGB-D processing, multi-view networks, volumetric data processing, and unordered point set processing. Each has its own advantages and limitations, so the appropriate choice depends on the data and the application.

RGB-D Networks: These exploit color and depth information via low-cost sensors, primarily in controlled environments such as indoor navigation. Techniques integrating RGB-D data into convolutional neural networks (CNNs) harness the extra depth dimension for improved accuracy over 2D counterparts.
Multi-View Convolutional Networks (MVCNs): MVCNs avoid some limitations of volumetric data by projecting objects into multiple 2D views and leveraging mature 2D CNN architectures. Despite strong performance in object classification tasks, their applicability may be limited by the need for pre-existing complete meshes and by difficulty handling occlusions.
Volumetric Approaches: These methods utilize voxel grids to maintain geometric consistency but are often computationally intensive. Recent architectures explore sparse representations and hierarchical feature learning to address efficiency issues.
PointNet and Its Derivatives: By consuming unordered point clouds directly, PointNet and its successors mark a shift towards handling raw 3D data without first converting it into a structured format. This line of research reports strong results across several benchmark datasets, with ongoing work on local feature learning and object segmentation.
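Two of the representations above can be sketched in a few lines. The example below is an illustration under stated assumptions, not code from the paper or from any of the reviewed architectures. `voxelize` turns a point cloud into a binary occupancy grid, the kind of input volumetric methods operate on. `global_feature` applies one shared per-point layer followed by max pooling, which is the symmetric-function idea that makes PointNet-style networks insensitive to point order. The function names, the grid size, and the random (untrained) weights are all hypothetical.

```python
import numpy as np

def voxelize(points, grid_size):
    """Map an (N, 3) point cloud to a boolean occupancy grid of shape
    (grid_size, grid_size, grid_size)."""
    mins = points.min(axis=0)
    # Use one scale for all axes so the aspect ratio is preserved.
    extent = (points.max(axis=0) - mins).max()
    extent = extent if extent > 0 else 1.0
    idx = np.floor((points - mins) / extent * grid_size).astype(int)
    # Points on the upper boundary fall into the last cell.
    idx = np.clip(idx, 0, grid_size - 1)
    grid = np.zeros((grid_size,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

def global_feature(points, weights, bias):
    """Apply the same linear layer + ReLU to every point, then max-pool
    over points. The max makes the result independent of point order."""
    per_point = np.maximum(points @ weights + bias, 0.0)  # shape (N, F)
    return per_point.max(axis=0)                          # shape (F,)

rng = np.random.default_rng(0)
cloud = rng.random((128, 3))          # synthetic points in the unit cube
weights = rng.normal(size=(3, 16))    # untrained, illustrative weights
bias = rng.normal(size=16)

grid = voxelize(cloud, grid_size=8)
feat = global_feature(cloud, weights, bias)
# Shuffling the rows leaves the pooled feature unchanged.
feat_shuffled = global_feature(cloud[rng.permutation(len(cloud))], weights, bias)
print(grid.sum(), np.allclose(feat, feat_shuffled))
```

The dense grid grows cubically with resolution, which is one reason volumetric methods are described as computationally intensive, whereas the pooled feature has a fixed size regardless of the number of input points.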

Benchmark Datasets and Future Directions

The paper also reviews key datasets vital for benchmarking 3D deep learning models. These datasets, ranging from indoor RGB-D to outdoor aerial point clouds, underpin the research community's efforts to standardize evaluation metrics and facilitate progress. Furthermore, the discussion section highlights the importance of tailored approaches depending on the application context and anticipates emerging opportunities, notably in object detection and unsupervised feature learning.

Conclusion

Griffiths and Boehm’s survey underscores the dynamic nature of 3D deep learning research. Despite significant advances across the different paradigms, no single approach suits all 3D data challenges. Hybrid methods that combine elements from existing frameworks could yield further improvements in versatility and accuracy. Additionally, more detailed datasets should foster advances in applications that rely on 3D data interpretation. As the field matures, continued innovation and optimization of these methods are expected to enable applications across a wide range of domains.