Infinite Texture: Text-guided High Resolution Diffusion Texture Synthesis (2405.08210v1)
Abstract: We present Infinite Texture, a method for generating arbitrarily large texture images from a text prompt. Our approach fine-tunes a diffusion model on a single texture so that the model learns to embed that texture's statistical distribution in its output domain. We seed this fine-tuning process with a sample texture patch, which can optionally be generated by a text-to-image model such as DALL-E 2. At generation time, the fine-tuned diffusion model is applied through a score aggregation strategy to generate output texture images of arbitrary resolution on a single GPU. We compare textures synthesized by our method with those from existing patch-based and deep learning texture synthesis methods. We also showcase two applications of our generated textures: 3D rendering and texture transfer.
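The abstract does not spell out the score aggregation strategy, so the following is only a minimal sketch of one plausible form: noise predictions are computed on overlapping fixed-size tiles of a large latent and averaged wherever tiles overlap, which keeps per-call memory bounded by the tile size. The names `window_starts`, `aggregate_scores`, and `predict_noise`, and the `tile` and `stride` values, are hypothetical and chosen for illustration; `predict_noise` stands in for a call to the fine-tuned denoiser at a single timestep.

```python
import numpy as np


def window_starts(size, tile, stride):
    """Tile start offsets along one axis, with the last tile flush to the edge."""
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile + 1, stride))
    if starts[-1] != size - tile:
        starts.append(size - tile)  # make sure the border is covered
    return starts


def aggregate_scores(latent, predict_noise, tile=64, stride=32):
    """Average per-tile noise predictions over overlapping windows.

    latent: array of shape (C, H, W).
    predict_noise: hypothetical callable mapping a (C, h, w) crop to an
        array of the same shape (a stand-in for the denoiser, assumed here).
    """
    _, h, w = latent.shape
    total = np.zeros_like(latent, dtype=np.float64)
    count = np.zeros((1, h, w), dtype=np.float64)
    for y in window_starts(h, tile, stride):
        for x in window_starts(w, tile, stride):
            crop = latent[:, y:y + tile, x:x + tile]
            total[:, y:y + tile, x:x + tile] += predict_noise(crop)
            count[:, y:y + tile, x:x + tile] += 1.0
    # Every pixel is covered by at least one tile, so count is never zero.
    return total / count
```

In a full sampler this aggregation would run once per denoising step, with the averaged prediction passed to the scheduler update. Because each call to `predict_noise` sees only one tile, the output size is limited by the memory needed to hold the latent rather than by the denoiser.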