MeshNet: Mesh Neural Network for 3D Shape Representation (1811.11424v1)

Published 28 Nov 2018 in cs.CV

Abstract: Mesh is an important and powerful type of data for 3D shapes and widely studied in the field of computer vision and computer graphics. Regarding the task of 3D shape representation, there have been extensive research efforts concentrating on how to represent 3D shapes well using volumetric grid, multi-view and point cloud. However, there is little effort on using mesh data in recent years, due to the complexity and irregularity of mesh data. In this paper, we propose a mesh neural network, named MeshNet, to learn 3D shape representation from mesh data. In this method, face-unit and feature splitting are introduced, and a general architecture with available and effective blocks are proposed. In this way, MeshNet is able to solve the complexity and irregularity problem of mesh and conduct 3D shape representation well. We have applied the proposed MeshNet method in the applications of 3D shape classification and retrieval. Experimental results and comparisons with the state-of-the-art methods demonstrate that the proposed MeshNet can achieve satisfying 3D shape classification and retrieval performance, which indicates the effectiveness of the proposed method on 3D shape representation.

Summary

The paper introduces a novel deep learning architecture that decomposes complex mesh data into spatial and structural features for 3D shape representation.
It employs face connectivity and mesh convolution blocks to aggregate information from neighboring faces while minimizing computational overhead.
Empirical evaluations on ModelNet40 show 91.9% classification accuracy and 81.9% retrieval mean average precision.

MeshNet: Mesh Neural Network for 3D Shape Representation

The paper introduces MeshNet, a deep learning architecture for learning 3D shape representations directly from mesh data. Mesh data is difficult to process because it is complex and irregular: a mesh is a heterogeneous collection of vertices, edges, and faces. This work departs from existing approaches that use volumetric grids, multi-view representations, or point clouds, and instead exploits the detailed geometric and spatial information already present in mesh structures.

Core Contributions

MeshNet distinguishes itself through a novel approach that involves the decomposition and reassembly of mesh features into spatial and structural components. By treating the polygon face as the fundamental unit and introducing face connectivity based on shared edges, the authors alleviate the inherent complexity and irregularity issues associated with mesh data. This design choice enables a robust per-face learning process similar to point cloud methodologies, such as PointNet, but tailored for mesh data.
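
The shared-edge notion of face connectivity described above can be illustrated with a minimal sketch. It assumes triangular faces given as tuples of vertex indices; the function name `face_adjacency` and the tetrahedron example are hypothetical and are not taken from the paper's code.

```python
from collections import defaultdict


def face_adjacency(faces):
    """For each triangular face, list the indices of faces that share an edge with it.

    `faces` is a sequence of (a, b, c) vertex-index triples (an assumption of this sketch).
    """
    # Map each undirected edge to the faces that contain it.
    edge_to_faces = defaultdict(list)
    for f_idx, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces[(min(u, v), max(u, v))].append(f_idx)

    # Two faces are neighbors when they appear under the same edge.
    neighbors = [[] for _ in faces]
    for shared in edge_to_faces.values():
        for i in shared:
            for j in shared:
                if i != j:
                    neighbors[i].append(j)
    return neighbors


# A tetrahedron: four triangular faces, each sharing an edge with the other three.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
neighbors = face_adjacency(faces)
for idx, adjacent in enumerate(neighbors):
    print(idx, sorted(adjacent))
```

On a closed triangle mesh every face ends up with exactly three neighbors, which gives each face a fixed-size neighborhood. Faces on the boundary of an open mesh have fewer, so a real pipeline would need some padding rule; how that is handled is not stated in this summary.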

The architectural innovation can be summarized through its key components:

Spatial and Structural Descriptors: The spatial descriptor applies multi-layer perceptrons (MLPs) to the face centers. The structural descriptor combines two operations: face rotate convolution, which captures "inner" features related to the shape of the face itself, and face kernel correlation, which captures "outer" features related to the alignment of surrounding faces.
Mesh Convolution Block: This block enlarges the receptive field of each face by aggregating information from neighboring faces. Spatial and structural features are combined by concatenation to produce enriched representations.
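
As a rough illustration of aggregating information from neighboring faces, the sketch below gathers each face's feature vector together with those of its edge-sharing neighbors. It is not the paper's implementation: the function `aggregate_neighbors`, the toy feature values, and the `"max"` pooling mode (included only as a common order-independent alternative to concatenation) are assumptions, and the learned layers that would follow the aggregation in a real network are omitted.

```python
def aggregate_neighbors(features, neighbors, mode="concat"):
    """Combine each face's feature vector with those of its neighboring faces.

    features:  one list of floats per face
    neighbors: one list of neighbor face indices per face
    mode:      "concat" joins the vectors end to end,
               "max" keeps the element-wise maximum (assumed alternative)
    """
    aggregated = []
    for i, own in enumerate(features):
        group = [own] + [features[j] for j in neighbors[i]]
        if mode == "concat":
            aggregated.append([x for vec in group for x in vec])
        elif mode == "max":
            aggregated.append([max(col) for col in zip(*group)])
        else:
            raise ValueError(f"unknown mode: {mode}")
    return aggregated


# Toy 2-dimensional features for the four faces of a tetrahedron,
# where every face neighbors the other three.
features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [2.0, 2.0]]
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]

stacked = aggregate_neighbors(features, neighbors, mode="concat")
pooled = aggregate_neighbors(features, neighbors, mode="max")
print(len(stacked[0]), pooled[0])
```

Concatenation preserves every neighbor's features but makes the output depend on the neighbor count and ordering, whereas pooling yields a fixed-size, order-independent vector at the cost of discarding detail.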

These mechanisms result in an architecture capable of maintaining low computational overhead while enhancing representation power.

Empirical Evaluation

MeshNet is evaluated on the ModelNet40 dataset. For classification, it reaches 91.9% accuracy, comparable to methods built on other 3D data representations such as point clouds or volumetric grids. For retrieval, it achieves a mean average precision of 81.9%, outperforming earlier approaches that rely on handcrafted mesh features.
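
For readers unfamiliar with the retrieval metric, the sketch below computes mean average precision in its standard form: for each query, precision is measured at every rank where a relevant item appears, averaged over those ranks, and then averaged over queries. The summary does not describe the paper's exact retrieval protocol, so the function names, the class labels, and the toy rankings here are illustrative assumptions.

```python
def average_precision(ranked_labels, query_label):
    """Average precision for one query, given the labels of retrieved items in rank order."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of per-query average precision; `queries` holds (query_label, ranked_labels) pairs."""
    scores = [average_precision(ranked, label) for label, ranked in queries]
    return sum(scores) / len(scores)


# Two toy queries with hypothetical class labels.
queries = [
    ("chair", ["chair", "table", "chair"]),  # relevant items at ranks 1 and 3
    ("table", ["chair", "table"]),           # relevant item at rank 2
]
print(round(mean_average_precision(queries), 4))
```

In this toy case the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean is 2/3.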

Implications and Future Directions

MeshNet's ability to efficiently process and represent 3D shapes using mesh data provides a foundation for extending it to a wider range of computer vision tasks. Its observed robustness to variations in the number of faces suggests flexibility and applicability in diverse contexts where 3D mesh data are prevalent. Future investigations might explore additional optimizations or hybrid approaches that could further improve representation quality or extend applicability to more complex datasets and tasks.

In conclusion, MeshNet represents a significant stride towards more nuanced and effective 3D shape representation leveraging mesh data. By systematically addressing the challenges posed by mesh complexity and irregularity, it opens avenues for advanced applications, allowing deeper integration of geometric processing within the field of deep learning for 3D shapes.
