Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion (2003.01456v2)

Published 3 Mar 2020 in cs.CV and cs.LG

Abstract: While many works focus on 3D reconstruction from images, in this paper, we focus on 3D shape reconstruction and completion from a variety of 3D inputs, which are deficient in some respect: low and high resolution voxels, sparse and dense point clouds, complete or incomplete. Processing of such 3D inputs is an increasingly important problem as they are the output of 3D scanners, which are becoming more accessible, and are the intermediate output of 3D computer vision algorithms. Recently, learned implicit functions have shown great promise as they produce continuous reconstructions. However, we identified two limitations in reconstruction from 3D inputs: 1) details present in the input data are not retained, and 2) poor reconstruction of articulated humans. To solve this, we propose Implicit Feature Networks (IF-Nets), which deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data retaining the nice properties of recent learned implicit functions, but critically they can also retain detail when it is present in the input data, and can reconstruct articulated humans. Our work differs from prior work in two crucial aspects. First, instead of using a single vector to encode a 3D shape, we extract a learnable 3-dimensional multi-scale tensor of deep features, which is aligned with the original Euclidean space embedding the shape. Second, instead of classifying x-y-z point coordinates directly, we classify deep features extracted from the tensor at a continuous query point. We show that this forces our model to make decisions based on global and local shape structure, as opposed to point coordinates, which are arbitrary under Euclidean transformations. Experiments demonstrate that IF-Nets clearly outperform prior work in 3D object reconstruction in ShapeNet, and obtain significantly more accurate 3D human reconstructions.

Citations (465)

Summary

  • The paper introduces IF-Nets, which leverage multi-scale feature tensors for detailed 3D shape reconstruction and completion.
  • It employs feature-based classification at continuous query points to preserve both global and local shape structures effectively.
  • Experiments demonstrate superior performance in point cloud completion, voxel super-resolution, and single-view human reconstruction tasks.

Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion

The paper "Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion" introduces an innovative approach, termed Implicit Feature Networks (IF-Nets), which targets the challenges associated with reconstructing and completing 3D shapes from various imperfect 3D inputs. These include low and high-resolution voxels, sparse and dense point clouds, and inputs that may be either complete or incomplete.

Methods and Contributions

IF-Nets address two critical limitations of existing learned implicit function models: the inability to retain input detail and the inadequate reconstruction of articulated humans. The paper presents several key advancements:

  1. Multi-scale Feature Tensor: Instead of encoding 3D shapes in a single vector, the proposed method extracts a three-dimensional multi-scale tensor of deep features. This encoding aligns with the Euclidean space of the original shape, allowing for a more comprehensive preservation of detail and structure.
  2. Feature-based Classification: Rather than classifying x-y-z point coordinates directly, which are arbitrary under Euclidean transformations, IF-Nets classify deep features extracted from the tensor at continuous query points. This ensures that the network makes decisions based on global and local shape structures.
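The two design choices above can be sketched with a minimal NumPy example: trilinearly sample a feature tensor (aligned with the shape's Euclidean space) at continuous query points, then classify the sampled features into occupancy values. The grid resolution, feature dimension, and the tiny two-layer classifier are illustrative stand-ins, not the paper's actual architecture:

```python
import numpy as np

def sample_feature_grid(grid, points):
    """Trilinearly sample a deep-feature grid at continuous query points.

    grid:   (C, D, H, W) feature tensor aligned with the shape's space,
            with coordinates assumed normalized to [0, 1].
    points: (N, 3) continuous x-y-z query points.
    Returns (N, C): one feature vector per query point.
    """
    C, D, H, W = grid.shape
    dims = np.array([D, H, W])
    coords = points * (dims - 1)          # map [0, 1] into voxel index space
    lo = np.floor(coords).astype(int)
    hi = np.minimum(lo + 1, dims - 1)
    t = coords - lo                       # (N, 3) interpolation weights

    out = np.zeros((points.shape[0], C))
    for dz in (0, 1):                     # accumulate the 8 cell corners
        for dy in (0, 1):
            for dx in (0, 1):
                idx = np.where([dz, dy, dx], hi, lo)              # (N, 3)
                w = np.prod(np.where([dz, dy, dx], t, 1 - t), 1)  # (N,)
                out += w[:, None] * grid[:, idx[:, 0], idx[:, 1], idx[:, 2]].T
    return out

rng = np.random.default_rng(0)
grid = rng.standard_normal((8, 16, 16, 16))   # toy multi-scale feature tensor
pts = rng.random((5, 3))                      # 5 continuous query points
feats = sample_feature_grid(grid, pts)        # (5, 8) point-wise features

# Tiny stand-in for the learned occupancy classifier f(features) -> [0, 1];
# the point is that it sees features, never raw x-y-z coordinates.
W1, b1 = rng.standard_normal((8, 16)), np.zeros(16)
W2, b2 = rng.standard_normal((16, 1)), np.zeros(1)
occupancy = 1 / (1 + np.exp(-(np.maximum(feats @ W1 + b1, 0) @ W2 + b2)))
```

Because the classifier's input is the locally sampled feature vector rather than the coordinate itself, its decisions depend on the surrounding shape structure, which is the invariance property the paper argues for.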

These approaches collectively enable IF-Nets to produce detailed, continuous reconstructions that surpass much of the prior work, particularly in handling articulated human forms, which traditional methods struggle to represent accurately.

Numerical Results and Validation

Extensive experiments demonstrate the superiority of IF-Nets over existing methods across several tasks, including:

  • Point Cloud Completion: IF-Nets outperform Occupancy Networks, Deep Marching Cubes, and Point Set Generation Networks in both detail retention and global structure recovery, with significant improvements in Chamfer distance and Normal Consistency metrics.
  • Voxel Super-Resolution: IF-Nets recover detailed shapes from coarse voxel grids, highlighting their robustness in preserving intricate detail, as demonstrated by superior IoU and Chamfer-L2 scores.
  • Single-View Human Reconstruction: IF-Nets successfully reconstruct detailed, plausible shapes from single-view point clouds, tackling the challenge of missing data in occluded regions of human forms.
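For reference, the two metrics named above can be computed with a brute-force sketch; the O(N·M) pairwise-distance approach is fine for small point sets, though the paper's exact evaluation protocol (sample counts, normalization) may differ:

```python
import numpy as np

def chamfer_l2(a, b):
    """Symmetric Chamfer-L2 distance between point sets a (N, 3) and b (M, 3):
    mean squared distance to the nearest neighbor in the other set,
    summed over both directions."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # (N, M) pairwise
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def voxel_iou(a, b):
    """Intersection-over-union of two boolean occupancy grids of equal shape."""
    a, b = a.astype(bool), b.astype(bool)
    return (a & b).sum() / (a | b).sum()
```

A perfect reconstruction yields Chamfer distance 0 and IoU 1; lower Chamfer and higher IoU both indicate closer agreement with the ground-truth shape.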

Implications and Future Directions

The proposed methodology has notable theoretical and practical implications, offering a versatile and robust solution for 3D reconstruction tasks in both academic research and practical applications such as graphics, virtual reality, and autonomous systems. The authors suggest potential extensions toward generative models, which could sample conditioned hypotheses from partial input, and further exploration of image-based reconstruction.

Overall, the research sets a foundation for future explorations that are likely to continue evolving the capabilities of learned implicit functions, addressing increasingly complex 3D reconstruction challenges across varied domains.