
InfinityGAN: Towards Infinite-Pixel Image Synthesis (2104.03963v4)

Published 8 Apr 2021 in cs.CV

Abstract: We present a novel framework, InfinityGAN, for arbitrary-sized image generation. The task is associated with several key challenges. First, scaling existing models to an arbitrarily large image size is resource-constrained, in terms of both computation and availability of large-field-of-view training data. InfinityGAN trains and infers in a seamless patch-by-patch manner with low computational resources. Second, large images should be locally and globally consistent, avoid repetitive patterns, and look realistic. To address these, InfinityGAN disentangles global appearances, local structures, and textures. With this formulation, we can generate images with spatial size and level of details not attainable before. Experimental evaluation validates that InfinityGAN generates images with superior realism compared to baselines and features parallelizable inference. Finally, we show several applications unlocked by our approach, such as spatial style fusion, multi-modal outpainting, and image inbetweening. All applications can be operated with arbitrary input and output sizes. Please find the full version of the paper at https://openreview.net/forum?id=ufGMqIM0a4b .

Citations (66)

Summary

  • The paper presents a patch-by-patch image synthesis method that generates arbitrary-sized images with low computational resources while keeping them locally and globally consistent.
  • The paper disentangles global appearances, local structures, and textures, which improves realism and enables parallelizable inference.
  • The paper evaluates its approach with the ScaleInv FID metric and reports better realism and structural coherence than previous models.

InfinityGAN: Arbitrary-Sized Image Synthesis

InfinityGAN is a framework for generating images of arbitrary size. Scaling existing generative models to very large images is limited by two resources: computation, and the availability of training data with a large field of view. InfinityGAN addresses both by training and inferring in a seamless patch-by-patch manner, which keeps computational requirements low. The generated images are meant to stay locally and globally consistent while avoiding repetitive patterns. To achieve this, InfinityGAN disentangles global appearances, local structures, and textures, which lets it generate images with a spatial size and level of detail that earlier models could not attain.

Key Contributions

The paper presents several notable advancements:

  1. Seamless Patch-by-Patch Generation: InfinityGAN trains and infers one patch at a time, so the full image never has to be processed at once, and the patches join without visible seams.
  2. Disentangled Image Characteristics: By separating the global appearance from local structures and textures, InfinityGAN keeps large images consistent and realistic.
  3. Parallelizable Inference: Patches can be generated in parallel at inference time.
  4. Applications: InfinityGAN enables applications such as spatial style fusion, multi-modal outpainting, and image inbetweening, all of which work with arbitrary input and output sizes.
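The link between patch-by-patch generation and parallel inference can be illustrated with a toy example. The sketch below is not the InfinityGAN generator: it assumes a single padding-free ("valid") convolution applied to a shared random field, and the names `valid_conv2d`, `generate_patch`, and `latent_field` are hypothetical. It only shows that when a layer uses no padding, each output patch depends solely on its own input crop, so patches can be computed independently and stitched to match the result of processing the whole field at once.

```python
import numpy as np

def valid_conv2d(x, k):
    """Padding-free ('valid') 2D cross-correlation of a single-channel map."""
    kh, kw = k.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * x[i:i + out_h, j:j + out_w]
    return out

def generate_patch(latent_field, k, top, left, size):
    """Compute output[top:top+size, left:left+size] from only the crop it needs."""
    kh, kw = k.shape
    crop = latent_field[top:top + size + kh - 1, left:left + size + kw - 1]
    return valid_conv2d(crop, k)

rng = np.random.default_rng(0)
latent_field = rng.standard_normal((34, 34))  # toy stand-in for a shared latent field
kernel = rng.standard_normal((3, 3))          # toy stand-in for learned weights

# Reference: process the whole field in one pass (32x32 output).
full = valid_conv2d(latent_field, kernel)

# Patch-by-patch: each 16x16 tile is computed independently of the others,
# so the loop iterations could run in parallel.
patch = 16
stitched = np.zeros_like(full)
for top in range(0, full.shape[0], patch):
    for left in range(0, full.shape[1], patch):
        stitched[top:top + patch, left:left + patch] = generate_patch(
            latent_field, kernel, top, left, patch
        )

print("seamless:", np.allclose(full, stitched))
```

With zero padding instead, tiles would see artificial borders that the full-field pass does not, and the stitched result would differ near patch boundaries.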

Experimental Results

Qualitative and quantitative experiments validate InfinityGAN. The model surpasses existing baselines in realism and structural coherence, and the experiments show the practical advantage of its formulation for generating large, realistic images. The paper also introduces ScaleInv FID, a metric for evaluating image synthesis at scales larger than the training data. Under this metric, InfinityGAN outperforms previous models and maintains consistent performance as the field of view expands.
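This summary does not describe how ScaleInv FID is computed. As background only, the standard Fréchet Inception Distance that the name builds on compares Gaussians fitted to Inception features of real and generated images. Here $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the feature mean and covariance for real and generated images respectively; how the paper adapts the procedure to larger fields of view is not stated here and is not shown below.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the two feature distributions are closer.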

Implications and Future Directions

InfinityGAN has both theoretical and practical implications. Theoretically, it demonstrates that infinite-pixel image generation is feasible by combining neural implicit functions with padding-free convolutional networks. Practically, it supports applications that need very large images at low computational cost, such as virtual reality environments, digital content creation, and expansive artistic designs.
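Why an implicit function suits unbounded canvases can be shown with a small sketch. The code below is an untrained toy, not the network from the paper: `make_implicit_field` is a hypothetical helper that maps pixel coordinates plus a global latent vector to a value through fixed random sinusoids. Because the value at a location depends only on its coordinates and the global latent, any window can be queried on demand, and overlapping windows agree wherever they overlap.

```python
import numpy as np

def make_implicit_field(global_latent, num_freqs=8, seed=0):
    """Return a coordinate-conditioned field (hypothetical, untrained)."""
    rng = np.random.default_rng(seed)
    freqs = rng.standard_normal((2, num_freqs))  # fixed random frequencies
    phase = global_latent[:num_freqs]            # global latent shifts the pattern

    def field(ys, xs):
        # Build an (H, W, 2) grid of absolute coordinates for the queried window.
        coords = np.stack(np.meshgrid(ys, xs, indexing="ij"), axis=-1)
        # Each output value depends only on (y, x) and the global latent.
        return np.sin(coords @ freqs + phase).mean(axis=-1)

    return field

z = np.random.default_rng(1).standard_normal(8)  # toy global latent
field = make_implicit_field(z)

# Two overlapping 8x8 windows queried independently.
a = field(np.arange(0, 8), np.arange(0, 8))
b = field(np.arange(4, 12), np.arange(4, 12))
print("overlap agrees:", np.allclose(a[4:, 4:], b[:4, :4]))

# A window far from the origin is just another query; no full canvas is stored.
far = field(np.arange(10**6, 10**6 + 8), np.arange(10**6, 10**6 + 8))
print(far.shape)
```

A learned model would replace the fixed sinusoids with trained layers, but the property that matters here, evaluation at arbitrary coordinates without materializing the whole image, is the same.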

The work also provides a stepping stone for future research in generative model scalability and efficiency. Areas for potential development include optimizing training efficiencies for even larger datasets, enhancing the integration of neural implicit functions for diverse generative tasks, and improving the model's capacity to synthesize images across varied themes beyond landscapes.

In summary, InfinityGAN provides a framework for infinite-pixel image synthesis that keeps computational cost low and supports several applications with arbitrary input and output sizes.
