CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner (2405.14979v1)

Published 23 May 2024 in cs.GR and cs.CV

Abstract: We present a novel generative 3D modeling system, coined CraftsMan, which can generate high-fidelity 3D geometries with highly varied shapes, regular mesh topologies, and detailed surfaces, and, notably, allows for refining the geometry in an interactive manner. Despite the significant advancements in 3D generation, existing methods still struggle with lengthy optimization processes, irregular mesh topologies, noisy surfaces, and difficulties in accommodating user edits, consequently impeding their widespread adoption and implementation in 3D modeling software. Our work is inspired by the craftsman, who usually roughs out the holistic figure of the work first and elaborates the surface details subsequently. Specifically, we employ a 3D native diffusion model, which operates on latent space learned from latent set-based 3D representations, to generate coarse geometries with regular mesh topology in seconds. In particular, this process takes as input a text prompt or a reference image and leverages a powerful multi-view (MV) diffusion model to generate multiple views of the coarse geometry, which are fed into our MV-conditioned 3D diffusion model for generating the 3D geometry, significantly improving robustness and generalizability. Following that, a normal-based geometry refiner is used to significantly enhance the surface details. This refinement can be performed automatically, or interactively with user-supplied edits. Extensive experiments demonstrate that our method achieves high efficacy in producing superior-quality 3D assets compared to existing methods. HomePage: https://craftsman3d.github.io/, Code: https://github.com/wyysf-98/CraftsMan


Summary

  • The paper introduces CraftsMan, which generates high-fidelity 3D models by integrating a 3D native diffusion model with a normal-based geometry refiner.
  • It employs a two-stage approach that first creates coarse geometries and then allows interactive user edits to refine intricate surface details.
  • Experimental results on the Google Scanned Objects dataset show that CraftsMan matches or outperforms existing methods on Chamfer Distance and Volume IoU.

Overview of CraftsMan: A Generative 3D Modeling System

The paper presents a novel generative 3D modeling system, termed CraftsMan, which generates high-fidelity 3D geometries featuring varied shapes, regular mesh topologies, and detailed surfaces. Its distinguishing feature is interactive refinement of the generated geometry. Traditional 3D generation methods often struggle with time-consuming optimization, irregular mesh topologies, noisy surfaces, and limited support for user edits. CraftsMan addresses these issues by drawing inspiration from the workflow of a craftsman, who first roughs out the general shape and subsequently refines the intricate details.

System Architecture

The CraftsMan system comprises two key stages: a 3D native diffusion model for coarse geometry generation and a normal-based geometry refiner for enhancing surface details. This separation enables efficient and robust 3D asset creation from a single reference image or a text prompt.

  1. 3D Native Diffusion Model: This model operates in a latent space learned from latent set-based 3D representations. Given a text prompt or a reference image, a multi-view (MV) diffusion model first generates multiple consistent views, which are then fed into the MV-conditioned 3D diffusion model to produce the coarse geometry in seconds. Conditioning on multiple views markedly improves the robustness and generalizability of the generated 3D assets.
  2. Geometry Refinement: The refinement stage uses a normal-based geometry refiner to enhance surface details. Refinement can run automatically or interactively with user-supplied edits. The process is underpinned by ControlNet-tile and surface normal map diffusion, enabling efficient mesh optimization while preserving the original topology (a simplified sketch of this step follows the list).
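
To make the refinement stage concrete, below is a minimal, self-contained sketch of normal-guided vertex optimization in PyTorch. It displaces mesh vertices so that face normals match a set of target normals (for example, normals predicted by a normal map diffusion model or edited by a user) while the connectivity stays fixed. The function names and loss weights are illustrative assumptions, not the authors' implementation.

```python
# Toy normal-guided mesh refinement: optimize per-vertex offsets so that
# face normals match target normals, keeping the topology (faces) fixed.
# Illustrative sketch only; not the CraftsMan implementation.
import torch

def face_normals(verts: torch.Tensor, faces: torch.Tensor) -> torch.Tensor:
    """Unit normal of each triangle, differentiable w.r.t. verts."""
    v0, v1, v2 = verts[faces[:, 0]], verts[faces[:, 1]], verts[faces[:, 2]]
    n = torch.cross(v1 - v0, v2 - v0, dim=1)
    return n / (n.norm(dim=1, keepdim=True) + 1e-8)

def refine(verts, faces, target_normals, iters=200, lr=1e-2, lam=0.1):
    """Displace vertices so face normals align with target_normals (F, 3)."""
    offsets = torch.zeros_like(verts, requires_grad=True)
    opt = torch.optim.Adam([offsets], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        n = face_normals(verts + offsets, faces)
        align = (1.0 - (n * target_normals).sum(dim=1)).mean()  # cosine loss
        reg = lam * offsets.pow(2).mean()  # keep displacements small
        (align + reg).backward()
        opt.step()
    return (verts + offsets).detach()
```

In the full system, the enhanced normal maps produced by the diffusion model would drive an optimization of this kind across multiple rendered views; the sketch collapses that loop into a single fixed target for clarity.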

Numerical Evaluation and Results

Extensive experiments were conducted on the Google Scanned Objects (GSO) dataset, using Chamfer Distance (CD) and Volume Intersection over Union (IoU) as metrics. Quantitative results show that CraftsMan achieves comparable or superior performance to current generative models in producing high-quality 3D assets.

For instance, CraftsMan recorded a Chamfer Distance of 0.0355 and a Volume IoU of 0.5092, outperforming methods such as Point-E and Shap-E, which had higher Chamfer Distances and lower IoUs. Compared with InstantMesh, which produces accurate but less detailed geometries, CraftsMan generated intricate and faithful representations of the input prompts with significantly lower inference time.
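
For reference, the following is a hedged sketch of how these two metrics are commonly computed; exact conventions (point sampling density, normalization, voxel resolution) vary between papers, so this follows standard definitions rather than the authors' exact evaluation protocol.

```python
# Common definitions of Chamfer Distance and Volume IoU.
# Conventions (normalization, sample counts) are assumptions, not
# necessarily the paper's exact protocol.
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets p (N, 3) and q (M, 3)."""
    d_pq = cKDTree(q).query(p)[0]  # nearest-neighbor distance, p -> q
    d_qp = cKDTree(p).query(q)[0]  # nearest-neighbor distance, q -> p
    return d_pq.mean() + d_qp.mean()

def volume_iou(occ_a: np.ndarray, occ_b: np.ndarray) -> float:
    """Intersection over union of two boolean occupancy grids."""
    inter = np.logical_and(occ_a, occ_b).sum()
    union = np.logical_or(occ_a, occ_b).sum()
    return float(inter) / float(union)
```

Lower Chamfer Distance and higher Volume IoU indicate a closer match to the ground-truth shape.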

Implications and Future Directions

The implications of CraftsMan are multifaceted. Practically, it offers an efficient and user-friendly tool for industries such as video gaming, augmented reality, and film production, where rapid and detailed 3D asset creation is in high demand. Theoretically, the system sets a precedent for integrating multi-view conditions and interactive refinement within generative 3D modeling, thus addressing long-standing challenges in the field.

Future developments in this area might focus on enhancing the controllability of the Latent Set Diffusion Model and exploring methods for generating textures along with geometries. Additionally, further research into expanding and diversifying the 3D datasets used for training could significantly improve the generalizability and robustness of such models.

Conclusion

CraftsMan represents a significant advancement in generative 3D modeling, effectively bridging the gap between coarse geometry generation and detailed refinement. Its ability to combine multi-view diffusion conditions with interactive refinement opens new avenues for producing high-fidelity 3D assets efficiently. While there remain challenges and opportunities for further enhancement, CraftsMan demonstrates the potential for next-generation 3D modeling systems in both research and practical applications.