Mesh Convolution with Continuous Filters for 3D Surface Parsing (2112.01801v3)

Published 3 Dec 2021 in cs.CV

Abstract: Geometric feature learning for 3D surfaces is critical for many applications in computer graphics and 3D vision. However, deep learning currently lags in hierarchical modeling of 3D surfaces due to the lack of required operations and/or their efficient implementations. In this paper, we propose a series of modular operations for effective geometric feature learning from 3D triangle meshes. These operations include novel mesh convolutions, efficient mesh decimation and associated mesh (un)poolings. Our mesh convolutions exploit spherical harmonics as orthonormal bases to create continuous convolutional filters. The mesh decimation module is GPU-accelerated and able to process batched meshes on-the-fly, while the (un)pooling operations compute features for up/down-sampled meshes. We provide open-source implementation of these operations, collectively termed Picasso. Picasso supports heterogeneous mesh batching and processing. Leveraging its modular operations, we further contribute a novel hierarchical neural network for perceptual parsing of 3D surfaces, named PicassoNet++. It achieves highly competitive performance for shape analysis and scene segmentation on prominent 3D benchmarks. The code, data and trained models are available at https://github.com/EnyaHermite/Picasso.

Summary

The paper introduces mesh convolutions that use spherical harmonics to build continuous filters for feature learning; the facet2vertex convolution remains rotation-dependent.
The paper implements GPU-accelerated mesh decimation to efficiently process large-scale 3D meshes while preserving geometric integrity.
The paper develops the open-source Picasso toolkit and PicassoNet++, reporting highly competitive performance in shape analysis and scene segmentation.

The paper "Mesh Convolution with Continuous Filters for 3D Surface Parsing" presents a set of modular operations for geometric feature learning on 3D triangle meshes, targeting the lack of hierarchical modeling operations (and of efficient implementations of them) that has held back deep learning on 3D surfaces. The operations comprise mesh convolutions, a mesh decimation algorithm, and the associated pooling and unpooling operations.
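
One way to picture the pooling and unpooling operations is through a cluster map that records which decimated-mesh vertex each original vertex was merged into. The sketch below is a minimal illustration under that assumption; `pool_max`, `unpool_copy`, and `cluster_of` are hypothetical names rather than Picasso's API, and max pooling is only one possible reduction.

```python
from collections import defaultdict


def pool_max(features, cluster_of):
    """Max-pool per-vertex features onto the vertices of a decimated mesh.

    features   : list of feature vectors, one per input vertex
    cluster_of : cluster_of[i] is the decimated-mesh vertex that input
                 vertex i was merged into (hypothetical mapping; indices
                 are assumed to be contiguous and start at 0)
    """
    groups = defaultdict(list)
    for vertex, cluster in enumerate(cluster_of):
        groups[cluster].append(features[vertex])
    n_out = max(cluster_of) + 1
    # Channel-wise maximum over the members of each cluster.
    return [[max(channel) for channel in zip(*groups[c])] for c in range(n_out)]


def unpool_copy(pooled, cluster_of):
    """Unpool by copying each cluster's feature back to all of its members."""
    return [list(pooled[c]) for c in cluster_of]


features = [[1.0, 5.0], [3.0, 2.0], [0.0, 0.0], [4.0, -1.0]]
cluster_of = [0, 0, 1, 1]  # vertices 0,1 -> output 0; vertices 2,3 -> output 1
pooled = pool_max(features, cluster_of)
restored = unpool_copy(pooled, cluster_of)
```

Here `pooled` holds one feature vector per decimated vertex, and `restored` has the original vertex count again, which is what lets an encoder-decoder network move between mesh resolutions.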

Core Contributions:

Mesh Convolutions with Spherical Harmonics:
- The authors use spherical harmonics as orthonormal bases to build continuous convolutional filters for feature learning across the vertices and facets of 3D meshes. Continuous filters avoid the limitations of discrete filter partitioning. The filters are parameterized as continuous functions of azimuth and elevation angles, which is described as giving the learned features rotational and translational invariance, with the caveat that the facet2vertex convolution remains rotation-dependent.
GPU-Accelerated Mesh Decimation:
- The proposed mesh decimation method improves upon the quadric error metrics (QEM) framework by enabling batch processing on GPUs, thus expediting mesh simplification without compromising the geometric integrity of the original meshes. This method is crucial for processing large-scale meshes efficiently, allowing hierarchical network design with decimated resolutions.
Picasso Toolkit:
- An open-source implementation, named Picasso, integrates these mesh processing operations into both PyTorch and TensorFlow, so they can be combined with standard deep learning modules when working with 3D mesh data.
PicassoNet++:
- The authors introduce PicassoNet++, a neural network architecture devised for complex semantic analysis tasks over 3D surfaces. By leveraging the modular operations of Picasso, PicassoNet++ achieves competitive performance on prominent 3D datasets, including shape analysis and scene segmentation benchmarks like ShapeNetCore, S3DIS, and ScanNet.
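
For readers unfamiliar with the quadric error metric that the decimation contribution builds on, the following is a minimal CPU sketch of the classical formulation: each face contributes a plane quadric, a vertex accumulates the quadrics of its incident faces, and the cost of placing a vertex at a point is the sum of squared distances to those planes. It does not reproduce the paper's batched GPU implementation; all function names are illustrative assumptions.

```python
import math


def plane_of(a, b, c):
    """Return (nx, ny, nz, d) of the plane through triangle abc, unit normal."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(x * x for x in n))
    n = [x / length for x in n]
    d = -sum(n[i] * a[i] for i in range(3))
    return (n[0], n[1], n[2], d)


def quadric(plane):
    """4x4 quadric K = p p^T for the plane p = (nx, ny, nz, d)."""
    return [[pi * pj for pj in plane] for pi in plane]


def add_quadrics(q1, q2):
    """Element-wise sum, used when two vertices are merged by an edge collapse."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(q1, q2)]


def quadric_error(q, point):
    """v^T Q v with v = (x, y, z, 1): summed squared distances to the planes."""
    v = (point[0], point[1], point[2], 1.0)
    return sum(v[i] * q[i][j] * v[j] for i in range(4) for j in range(4))


# One vertex lies on a face in the plane z = 0, the other on a face in x = 1.
q1 = quadric(plane_of((0, 0, 0), (1, 0, 0), (0, 1, 0)))
q2 = quadric(plane_of((1, 0, 0), (1, 1, 0), (1, 0, 1)))
collapse_cost = quadric_error(add_quadrics(q1, q2), (0.0, 0.0, 2.0))
```

The candidate position (0, 0, 2) is 2 units from the plane z = 0 and 1 unit from the plane x = 1, so the collapse cost is 4 + 1 = 5. A decimation loop would repeatedly collapse the cheapest edge according to this cost.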

Key Numerical Results:

PicassoNet++ achieves state-of-the-art performance on datasets such as SHREC, CUBE, COSEG, HUMAN, and FAUST for semantic labeling and correspondence tasks.
Notably, in large-scale scene parsing, PicassoNet++ outperforms or remains competitive with existing methods while consuming fewer computational resources.

Implications and Future Directions:

The methodologies introduced offer substantial advancements in the field of geometric deep learning by providing techniques that handle the intrinsic irregularity of 3D surfaces represented as meshes. The use of continuous filter modeling with spherical harmonics marks a significant stride in addressing challenges of efficient and effective feature learning on 3D data.
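
To make continuous filter modeling concrete, the sketch below evaluates a filter as a weighted sum of real spherical harmonics at an arbitrary direction on the unit sphere. It is an illustration under several assumptions: only degrees 0 and 1 are used, the coefficients are made up rather than learned, the polar angle is measured from the +z axis (one common convention, which may differ from the paper's elevation angle), and the names `real_sh_basis` and `filter_weight` are hypothetical.

```python
import math


def real_sh_basis(azimuth, polar):
    """Real spherical harmonics of degree 0 and 1 at a unit-sphere direction."""
    x = math.sin(polar) * math.cos(azimuth)
    y = math.sin(polar) * math.sin(azimuth)
    z = math.cos(polar)
    c0 = 0.5 * math.sqrt(1.0 / math.pi)         # normalization for Y_0^0
    c1 = math.sqrt(3.0 / (4.0 * math.pi))       # normalization for degree 1
    return [c0, c1 * y, c1 * z, c1 * x]


def filter_weight(coeffs, azimuth, polar):
    """Continuous filter value: coefficients dotted with the basis functions."""
    basis = real_sh_basis(azimuth, polar)
    return sum(c * b for c, b in zip(coeffs, basis))


# Illustrative coefficients; in a network these would be trainable parameters.
coeffs = [0.5, -0.2, 1.0, 0.3]

# The filter is defined for every direction, not just a fixed set of bins.
w_pole = filter_weight(coeffs, azimuth=0.0, polar=0.0)
w_equator = filter_weight(coeffs, azimuth=math.pi / 2, polar=math.pi / 2)
```

Because the basis functions are orthonormal over the sphere, each coefficient controls an independent component of the filter's angular response, and no partitioning of directions into discrete bins is needed.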

The research sets the stage for future development in areas such as autonomous driving, virtual and augmented reality, and CAD modeling, where understanding and parsing 3D environments play a pivotal role. The continuous improvement of the Picasso toolkit could eventually enable real-time applications that require dynamic and large-scale 3D data processing.

By building convolutions from continuous, learned filters, the paper makes it practical to design hierarchical neural networks on meshes. This may carry over to applications in robotics and computer vision that depend on spatial reasoning and interaction within 3D environments.

GitHub

GitHub - EnyaHermite/PicassoPlus: Geometric deep learning for surface parsing from 3-D triangular meshes (125 stars)