The Perfect Match: 3D Point Cloud Matching with Smoothed Densities (1811.06879v3)

Published 16 Nov 2018 in cs.CV

Abstract: We propose 3DSmoothNet, a full workflow to match 3D point clouds with a siamese deep learning architecture and fully convolutional layers using a voxelized smoothed density value (SDV) representation. The latter is computed per interest point and aligned to the local reference frame (LRF) to achieve rotation invariance. Our compact, learned, rotation invariant 3D point cloud descriptor achieves 94.9% average recall on the 3DMatch benchmark data set, outperforming the state-of-the-art by more than 20 percentage points with only 32 output dimensions. This very low output dimension allows for near real-time correspondence search with 0.1 ms per feature point on a standard PC. Our approach is sensor- and scene-agnostic because of SDV, LRF and learning highly descriptive features with fully convolutional layers. We show that 3DSmoothNet trained only on RGB-D indoor scenes of buildings achieves 79.0% average recall on laser scans of outdoor vegetation, more than double the performance of our closest, learning-based competitors. Code, data and pre-trained models are available online at https://github.com/zgojcic/3DSmoothNet.

Summary

The paper introduces 3DSmoothNet, a novel 3D descriptor using SDV for efficient and robust point cloud matching with compact outputs.
It employs a voxelized SDV representation aligned with a Local Reference Frame to enforce rotation invariance and enhance gradient flow.
Experiments show 94.9% average recall on the 3DMatch benchmark, more than 20 percentage points above the prior state of the art, and 79.0% average recall on ETH outdoor laser scans when trained only on indoor RGB-D scenes.

An Analysis of "The Perfect Match: 3D Point Cloud Matching with Smoothed Densities"

This essay reviews the contributions and findings of "The Perfect Match: 3D Point Cloud Matching with Smoothed Densities" by Zan Gojcic et al. The paper presents 3DSmoothNet, an approach to 3D point cloud matching built on a siamese convolutional neural network (CNN) architecture. Its distinguishing feature is the use of the Smoothed Density Value (SDV) as the input parameterization, which makes the learned 3D point cloud descriptors both more descriptive and more efficient.

Core Contributions

The primary contribution of this research is a new 3D local feature descriptor that is compact and computationally efficient. 3DSmoothNet achieves this through the SDV, which reduces input sparsity and lets the network learn highly descriptive features with low-dimensional outputs (16 or 32 dimensions). The low dimensionality matters because it allows near real-time correspondence search: the abstract reports 0.1 ms per feature point on a standard PC.
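To make the link between compact descriptors and fast correspondence search concrete, here is a minimal sketch of descriptor matching by nearest-neighbor lookup in a 32-dimensional feature space. It is an illustration under stated assumptions, not the authors' implementation: the function name `mutual_nearest_neighbors`, the mutual-consistency filter, and the synthetic descriptors are all hypothetical, and a brute-force distance matrix stands in for the spatial index a real pipeline would likely use.

```python
import numpy as np

def mutual_nearest_neighbors(desc_a, desc_b):
    """Return index pairs (i, j) where desc_a[i] and desc_b[j] are each
    other's nearest neighbor in Euclidean descriptor space.
    (Hypothetical helper; the mutual check is an assumed filter.)"""
    # Pairwise squared distances via ||a||^2 - 2 a.b + ||b||^2.
    sq_a = np.sum(desc_a ** 2, axis=1)[:, None]
    sq_b = np.sum(desc_b ** 2, axis=1)[None, :]
    dists = sq_a - 2.0 * desc_a @ desc_b.T + sq_b
    nn_ab = np.argmin(dists, axis=1)  # best match in B for each row of A
    nn_ba = np.argmin(dists, axis=0)  # best match in A for each row of B
    idx_a = np.arange(desc_a.shape[0])
    mutual = nn_ba[nn_ab] == idx_a    # keep pairs that agree in both directions
    return np.stack([idx_a[mutual], nn_ab[mutual]], axis=1)

# Synthetic stand-in for learned descriptors: B is a shuffled, slightly
# perturbed copy of A, so the true correspondences are known.
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(100, 32))
perm = rng.permutation(100)
desc_b = desc_a[perm] + 0.01 * rng.normal(size=(100, 32))

matches = mutual_nearest_neighbors(desc_a, desc_b)
print(matches.shape)
```

The cost of each distance evaluation grows linearly with the descriptor dimension, which is one reason a 32-dimensional descriptor is cheaper to match than a descriptor with hundreds of dimensions.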

Methodology

3DSmoothNet employs a voxelized SDV representation, computed per interest point and aligned with the Local Reference Frame (LRF) to ensure rotation invariance. The use of SDV improves the gradient flow during backpropagation and mitigates boundary effects and noise, crucial for robust point cloud matching.
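The two steps described above, LRF alignment followed by smoothed-density voxelization, can be sketched in a few lines of NumPy. This is a simplified illustration rather than the paper's exact procedure: the axis sign rule (based on the skewness of the projections), the grid size, the grid extent, and the Gaussian bandwidth are all assumptions chosen for the example, and every function name is hypothetical.

```python
import numpy as np

def estimate_lrf(neighbors, center):
    """Estimate a local reference frame from the neighborhood covariance.
    Rows of the returned 3x3 matrix are the x, y, z axes.
    Simplified variant: signs are fixed by an assumed skewness rule."""
    offsets = neighbors - center
    cov = offsets.T @ offsets / len(offsets)
    _, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    z_axis = eigvecs[:, 0]                # direction of least variance
    x_axis = eigvecs[:, 2]                # direction of most variance
    # Resolve the sign ambiguity of each eigenvector in a rotation-consistent way.
    if np.sum((offsets @ z_axis) ** 3) < 0:
        z_axis = -z_axis
    if np.sum((offsets @ x_axis) ** 3) < 0:
        x_axis = -x_axis
    y_axis = np.cross(z_axis, x_axis)     # completes a right-handed frame
    return np.stack([x_axis, y_axis, z_axis])

def smoothed_density_voxels(local_pts, grid_size=8, half_extent=1.0, bandwidth=None):
    """Gaussian-smoothed point density on a cubic voxel grid, normalized to sum to 1.
    grid_size, half_extent and bandwidth are illustrative assumptions."""
    step = 2.0 * half_extent / grid_size
    if bandwidth is None:
        bandwidth = step
    centers_1d = -half_extent + step * (np.arange(grid_size) + 0.5)
    gx, gy, gz = np.meshgrid(centers_1d, centers_1d, centers_1d, indexing="ij")
    centers = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)
    # Squared distance from every voxel center to every point.
    d2 = np.sum((centers[:, None, :] - local_pts[None, :, :]) ** 2, axis=-1)
    density = np.exp(-d2 / (2.0 * bandwidth ** 2)).sum(axis=1)
    density /= density.sum()
    return density.reshape(grid_size, grid_size, grid_size)

def sdv_for_point(points, center):
    """Express the neighborhood in its LRF, then voxelize it."""
    axes = estimate_lrf(points, center)
    return smoothed_density_voxels((points - center) @ axes.T)

# Synthetic anisotropic patch and a randomly rotated copy of it.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 3)) * np.array([0.5, 0.3, 0.1])
center = points[0]
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(q) < 0:
    q[:, 0] *= -1.0                       # force a proper rotation (det = +1)

sdv_a = sdv_for_point(points, center)
sdv_b = sdv_for_point(points @ q.T, q @ center)
print(np.abs(sdv_a - sdv_b).max())        # difference between the two grids
```

Because each point contributes a smooth Gaussian bump instead of a hard 0/1 occupancy, nearly every voxel carries a non-zero value, which is the sense in which the representation reduces input sparsity. Comparing the grids of the original and the rotated patch is a quick check that the LRF alignment removes the rotation.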

The network uses fully convolutional layers that operate on the SDV voxel grids. This design extracts local geometric features that generalize across scenes and sensor modalities. It also keeps the output dimension low without sacrificing descriptiveness, which speeds up correspondence search in 3D point cloud data.

Experimental Results

Experiments show that 3DSmoothNet reaches 94.9% average recall on the 3DMatch benchmark dataset, which consists of indoor RGB-D scenes, outperforming the previous state of the art by more than 20 percentage points while using only 32 output dimensions.

Additionally, when evaluated on the ETH dataset, which includes laser scans of outdoor vegetation, a model trained only on indoor RGB-D scenes achieves 79.0% average recall, more than double the performance of the closest learning-based competitors. This transfer across both sensor type and scene type supports the claim that the combination of SDV, LRF, and fully convolutional feature learning yields descriptors that are sensor- and scene-agnostic.
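The recall figures above are averages over pairs of point cloud fragments. As a rough sketch of how such a number can be computed, the code below assumes the feature-match recall convention commonly used with 3DMatch: a correspondence counts as an inlier if its residual under the ground-truth transform is below a distance threshold, and a fragment pair counts as matched if its inlier ratio exceeds a second threshold. The function names and both default thresholds are assumptions for illustration and are not taken from the text above.

```python
import numpy as np

def inlier_ratio(src_pts, dst_pts, rotation, translation, dist_thresh=0.1):
    """Fraction of putative correspondences src_pts[i] <-> dst_pts[i] whose
    residual under the ground-truth rigid transform is below dist_thresh.
    (dist_thresh is an assumed value, in the units of the point cloud.)"""
    aligned = src_pts @ rotation.T + translation
    residuals = np.linalg.norm(aligned - dst_pts, axis=1)
    return float(np.mean(residuals < dist_thresh))

def feature_match_recall(inlier_ratios, ratio_thresh=0.05):
    """Fraction of fragment pairs whose inlier ratio exceeds ratio_thresh.
    (ratio_thresh is an assumed value.)"""
    return float(np.mean(np.asarray(inlier_ratios) > ratio_thresh))

# Toy data: identity rotation and a known translation as ground truth.
rotation = np.eye(3)
translation = np.array([1.0, 0.0, 0.0])
src = np.arange(30, dtype=float).reshape(10, 3)
true_dst = src @ rotation.T + translation

# Pair 1: 3 of 10 correspondences are correct, the rest are far off.
dst_pair1 = true_dst.copy()
dst_pair1[3:] += 1.0
# Pair 2: every correspondence is wrong.
dst_pair2 = true_dst + 1.0

ratios = [
    inlier_ratio(src, dst_pair1, rotation, translation),
    inlier_ratio(src, dst_pair2, rotation, translation),
]
recall = feature_match_recall(ratios)
print(ratios, recall)
```

Under this convention a pair does not need mostly correct correspondences to count as matched; it needs enough inliers for a robust estimator to recover the transform, which is why the ratio threshold is typically set low.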

Implications and Future Directions

3DSmoothNet is relevant to fields that need efficient and reliable 3D point cloud matching, such as 3D reconstruction, robotic navigation, and augmented reality. Because it reaches high recall with low-dimensional descriptors, it lowers the cost of correspondence search and brings near real-time matching within reach.

Future research could explore the integration of 3DSmoothNet with multi-modal data and its application in increasingly diverse and dynamic environments. Moreover, expanding the model's capabilities to deal with non-rigid transformations could further enhance its versatility.

In conclusion, the paper presents a compelling approach to 3D point cloud matching that balances descriptiveness, computational efficiency, and generalizability. The introduction of SDV as an input parameterization marks a notable advancement in 3D data processing, providing a foundation for further exploration and optimization in the domain of 3D vision technologies.

GitHub

GitHub - zgojcic/3DSmoothNet: [CVPR2019] The Perfect Match: 3D Point Cloud Matching with Smoothed Densities (497 stars)
