Local Padding in Patch-Based GANs for Seamless Infinite-Sized Texture Synthesis (2309.02340v5)

Published 5 Sep 2023 in cs.CV and eess.IV

Abstract: Texture models based on Generative Adversarial Networks (GANs) use zero-padding to implicitly encode positional information of image features. However, when the spatial input is extended to generate images at larger sizes, zero-padding often degrades image quality because the positional information at the center of the image becomes incorrect. Zero-padding can also limit the diversity within the generated large images. In this paper, we propose a novel approach for generating stochastic texture images at large, arbitrary sizes using GANs based on patch-by-patch generation. Instead of zero-padding, the generator uses local padding, which shares border features between generated patches, providing positional context and ensuring consistency at patch boundaries. The proposed models are trainable on a single texture image and require constant GPU memory regardless of the output image size, and can hence generate images of effectively infinite size. Our experiments show that our method significantly surpasses existing GAN-based texture models in the quality and diversity of the generated textures. Furthermore, applying local padding to state-of-the-art super-resolution models effectively eliminates tiling artifacts, enabling large-scale super-resolution. Our code is available at https://github.com/ai4netzero/Infinite_Texture_GANs.
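
To make the core idea concrete: when the image is generated patch by patch, each patch's intermediate feature maps are padded with border features cached from its already generated neighbors rather than with zeros, so adjacent patches see consistent context at their shared boundary. The PyTorch snippet below is a minimal sketch of this mechanism; the function name, the border-cache layout, and the left-to-right, top-to-bottom generation order are assumptions for illustration, not the authors' actual implementation.

```python
# A minimal sketch of the local-padding idea described in the abstract.
# pad_with_neighbors, its border-cache layout, and the raster-order generation
# assumption are illustrative guesses, not the authors' actual API.
import torch

def pad_with_neighbors(feat, left=None, top=None, pad=1):
    """Pad a patch's feature map with border features cached from already
    generated neighbors; fall back to zero-padding at the canvas edge.

    feat: (N, C, H, W) features of the current patch.
    left: (N, C, H, pad) rightmost feature columns of the left neighbor, or None.
    top:  (N, C, pad, W + 2*pad) bottom feature rows of the top neighbor
          (already padded horizontally), or None.
    """
    n, c, h, w = feat.shape
    # Horizontal padding: reuse the left neighbor's border where available.
    left_pad = left if left is not None else feat.new_zeros(n, c, h, pad)
    right_pad = feat.new_zeros(n, c, h, pad)  # right neighbor not generated yet
    feat = torch.cat([left_pad, feat, right_pad], dim=3)
    # Vertical padding: reuse the top neighbor's border where available.
    top_pad = top if top is not None else feat.new_zeros(n, c, pad, w + 2 * pad)
    bottom_pad = feat.new_zeros(n, c, pad, w + 2 * pad)
    return torch.cat([top_pad, feat, bottom_pad], dim=2)

# Usage: generate patches in raster order, caching each patch's border features
# so that its right and bottom neighbors can be padded consistently. Because
# only one patch (plus thin borders) lives in memory at a time, GPU memory
# stays constant no matter how large the assembled output image grows.
conv = torch.nn.Conv2d(64, 64, kernel_size=3, padding=0)
patch = torch.randn(1, 64, 16, 16)
padded = pad_with_neighbors(patch)  # first patch: no neighbors, zero-pad
out = conv(padded)                  # spatial size preserved: (1, 64, 16, 16)
```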

