Compositional Neural Textures (2404.12509v2)
Abstract: Texture plays a vital role in enhancing visual richness in both real photographs and computer-generated imagery. However, the process of editing textures often involves laborious and repetitive manual adjustments of textons, which are the recurring local patterns that characterize textures. This work introduces a fully unsupervised approach for representing textures using a compositional neural model that captures individual textons. We represent each texton as a 2D Gaussian function whose spatial support approximates its shape, and an associated feature that encodes its detailed appearance. By modeling a texture as a discrete composition of Gaussian textons, the representation offers both expressiveness and ease of editing. Textures can be edited by modifying the compositional Gaussians within the latent space, and new textures can be efficiently synthesized by feeding the modified Gaussians through a generator network in a feed-forward manner. This approach enables a wide range of applications, including transferring appearance from one image texture to another image, diversifying textures, texture interpolation, revealing/modifying texture variations, edit propagation, texture animation, and direct texton manipulation. The proposed approach contributes to advancing texture analysis, modeling, and editing techniques, and opens up new possibilities for creating visually appealing images with controllable textures.