FlashTex: Fast Relightable Mesh Texturing with LightControlNet
Abstract: Manually creating textures for 3D meshes is time-consuming, even for expert visual content creators. We propose a fast approach for automatically texturing an input 3D mesh based on a user-provided text prompt. Importantly, our approach disentangles lighting from surface material/reflectance in the resulting texture so that the mesh can be properly relit and rendered in any lighting environment. We introduce LightControlNet, a new text-to-image model based on the ControlNet architecture, which allows the specification of the desired lighting as a conditioning image to the model. Our text-to-texture pipeline then constructs the texture in two stages. The first stage produces a sparse set of visually consistent reference views of the mesh using LightControlNet. The second stage applies a texture optimization based on Score Distillation Sampling (SDS) that works with LightControlNet to increase the texture quality while disentangling surface material from lighting. Our algorithm is significantly faster than previous text-to-texture methods, while producing high-quality and relightable textures.
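The second stage's Score Distillation Sampling update can be illustrated with a toy sketch. This is a minimal, self-contained stand-in, not the paper's implementation: the `render` and `predict_noise` functions below are hypothetical placeholders for a differentiable mesh renderer and for LightControlNet's noise prediction, and the noise schedule and weighting `w(t)` are simplified assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def render(texture):
    # Placeholder for a differentiable render of the mesh with the current
    # texture parameters; here it is the identity, so d(render)/d(texture) = 1.
    return texture

def predict_noise(noisy_image, t):
    # Placeholder for the diffusion model's epsilon-prediction
    # (LightControlNet in the paper); here a toy function of the input.
    return (noisy_image - 0.5) * t

def sds_step(texture, lr=0.1):
    """One Score Distillation Sampling update on the texture parameters."""
    t = rng.uniform(0.02, 0.98)                  # random diffusion timestep
    eps = rng.standard_normal(texture.shape)     # injected Gaussian noise
    alpha = 1.0 - t                              # toy noise schedule
    noisy = np.sqrt(alpha) * render(texture) + np.sqrt(1.0 - alpha) * eps
    eps_hat = predict_noise(noisy, t)            # model's noise estimate
    w = 1.0                                      # timestep weighting w(t)
    # SDS gradient: w(t) * (eps_hat - eps) * d(render)/d(texture),
    # treating eps_hat as a constant (no backprop through the diffusion model).
    grad = w * (eps_hat - eps)
    return texture - lr * grad

texture = rng.uniform(0.0, 1.0, size=(4, 4))     # toy texture map
for _ in range(100):
    texture = sds_step(texture)
```

In the actual pipeline the gradient flows through the renderer into the material parameters (disentangled from lighting), and the conditioning image supplied to LightControlNet fixes the lighting during optimization.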