- The paper introduces a novel text-guided method that partitions rendered images into keep, refine, and generate regions for consistent 3D texturing.
- It utilizes a modified diffusion sampling process with depth- and mask-guidance to transfer textures across various 3D geometries.
- Empirical evaluations show TEXTure outperforms methods like Text2Mesh, offering faster texture generation and enhanced editing capabilities.
An Overview of TEXTure: Text-Guided Texturing of 3D Shapes
The paper "TEXTure: Text-Guided Texturing of 3D Shapes" introduces a new method that effectively extends text-to-image models for use in texturing 3D shapes. This technique leverages a pretrained depth-to-image diffusion model, employing an iterative process that textures a 3D model from numerous viewpoints.
Key Methodology
TEXTure improves on previous 3D texturing efforts through a modified diffusion sampling process guided by a novel trimap partitioning strategy, which divides each rendered view into three region types: *keep*, *refine*, and *generate*. Each region serves a specific function:
- *Generate* regions cover areas that are viewed for the first time and therefore need to be textured from scratch.
- *Refine* regions are previously painted areas now seen from a better angle, which warrant some degree of repainting.
- *Keep* regions are areas that have already been satisfactorily painted and should remain unaltered from the current viewpoint.
Through this dynamic partitioning, TEXTure ensures that textures remain consistent across views, an advance over existing methods, which often struggle with global consistency because each view is generated stochastically.
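As a concrete illustration, the trimap can be seen as a per-texel decision based on two pieces of information: whether a texel has been painted by an earlier view, and how favorably the current camera sees it. The sketch below is a hypothetical reconstruction of that logic rather than the authors' implementation; the viewing-quality score (e.g., the cosine between the surface normal and the view direction) and the refine margin are assumptions.

```python
import torch

def trimap_partition(painted, view_score, best_score, refine_margin=0.1):
    """Per-pixel keep/refine/generate masks for one rendered view (sketch).

    painted       : (H, W) bool  - texels already colored by earlier views
    view_score    : (H, W) float - viewing quality under the current camera,
                                   e.g. cos(angle) between normal and view ray
    best_score    : (H, W) float - best viewing quality seen so far per texel
    refine_margin : repaint only where the new view is clearly better
    """
    generate = ~painted                                          # seen for the first time
    refine = painted & (view_score > best_score + refine_margin)
    keep = painted & ~refine                                     # already well painted
    return keep, refine, generate

# Toy usage with random inputs
painted = torch.rand(64, 64) > 0.5
keep, refine, generate = trimap_partition(painted, torch.rand(64, 64), torch.rand(64, 64))
```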
Another core contribution is the ability to transfer textures between different 3D geometries without requiring an explicit surface-to-surface mapping. This is achieved by combining depth-guided and mask-guided diffusion models, which together yield seamless, globally consistent textures.
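To show how mask guidance can enter the sampling loop, the sketch below blends the already painted content back in at every denoising step, in the spirit of inpainting-style diffusion. It assumes a diffusers-style scheduler and UNet interface and simplifies the paper's modified sampling process; in particular, the exact handling of refine regions and the timestep bookkeeping for the re-noised content are glossed over.

```python
import torch

@torch.no_grad()
def masked_denoise_step(unet, scheduler, z_t, t, cond, z_keep_clean, keep_mask):
    """One mask-guided denoising step (hedged sketch, not the paper's exact scheme).

    unet         : depth-conditioned denoiser; conditioning is packed into `cond`
    scheduler    : diffusers-style noise scheduler (assumed API)
    z_t          : current noisy latent for this view
    z_keep_clean : clean latent of the already rendered (existing) texture
    keep_mask    : 1 where texture must be preserved, 0 in refine/generate regions
    """
    noise_pred = unet(z_t, t, **cond).sample
    z_prev = scheduler.step(noise_pred, t, z_t).prev_sample

    # Re-noise the known content to a comparable noise level and paste it into
    # the keep region, so those pixels converge back to the existing texture.
    z_keep_noisy = scheduler.add_noise(z_keep_clean, torch.randn_like(z_keep_clean), t)
    return keep_mask * z_keep_noisy + (1 - keep_mask) * z_prev
```

How strongly the existing content should constrain the refine regions is a separate design choice; the paper treats refine regions differently from generate regions so that previously painted areas are improved rather than simply overwritten.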
Practical Applications
TEXTure goes beyond texture generation: it also supports texture editing and refinement. Users can modify an existing texture with a text prompt or by applying edits, such as scribbles, directly to the texture map. This flexibility is useful in graphic design, game development, and 3D modeling, where texture detail and consistency are paramount.
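As a rough illustration of the scribble workflow, an edit on the texture map can simply overwrite the affected texels and flag them for refinement on the next texturing pass. The constants and array layout below are illustrative assumptions, not the paper's data structures.

```python
import numpy as np

KEEP, REFINE, GENERATE = 0, 1, 2  # illustrative per-texel states

def apply_scribble(texture, scribble_rgba, state):
    """Paste a user scribble into the texture atlas and mark it for refinement.

    texture       : (H, W, 3) float array, the current texture atlas
    scribble_rgba : (H, W, 4) float array, alpha > 0 where the user drew
    state         : (H, W) int array of per-texel states
    """
    drawn = scribble_rgba[..., 3] > 0
    texture[drawn] = scribble_rgba[..., :3][drawn]  # keep the rough strokes/colors
    state[drawn] = REFINE                           # let diffusion make them coherent
    return texture, state
```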
Evaluation and Implications
The empirical evaluations in the paper indicate that TEXTure improves both the speed and the quality of texture generation compared with contemporary approaches such as Text2Mesh and Latent-Paint. User studies and quantitative comparisons suggest that the method effectively bridges the gap between detailed 2D image generation and 3D texturing.
By harnessing powerful diffusion models, TEXTure introduces a faster and more reliable path to generating detailed 3D textures. Although it marks a significant step forward, the authors acknowledge certain limitations, such as depth inconsistencies in specific scenarios. These challenges, however, pave the way for future enhancements focusing on model robustness and the dynamic selection of optimal viewpoints.
Conclusion
The TEXTure method represents a significant advance in text-guided 3D texturing and editing by building on the strengths of diffusion models. It provides practical solutions to common texture generation problems while opening new possibilities for creative 3D workflows. Going forward, this line of work holds promise for further refinement and broader application within 3D graphics.