
GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond (2207.14812v1)

Published 29 Jul 2022 in cs.CV

Abstract: We show that pre-trained Generative Adversarial Networks (GANs) such as StyleGAN and BigGAN can be used as a latent bank to improve the performance of image super-resolution. While most existing perceptual-oriented approaches attempt to generate realistic outputs through learning with adversarial loss, our method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by directly leveraging rich and diverse priors encapsulated in a pre-trained GAN. But unlike prevalent GAN inversion methods that require expensive image-specific optimization at runtime, our approach only needs a single forward pass for restoration. GLEAN can be easily incorporated in a simple encoder-bank-decoder architecture with multi-resolution skip connections. Employing priors from different generative models allows GLEAN to be applied to diverse categories (e.g., human faces, cats, buildings, and cars). We further present a lightweight version of GLEAN, named LightGLEAN, which retains only the critical components in GLEAN. Notably, LightGLEAN consists of only 21% of parameters and 35% of FLOPs while achieving comparable image quality. We extend our method to different tasks including image colorization and blind image restoration, and extensive experiments show that our proposed models perform favorably in comparison to existing methods. Codes and models are available at https://github.com/open-mmlab/mmediting.

Authors (5)
  1. Kelvin C. K. Chan (34 papers)
  2. Xiangyu Xu (48 papers)
  3. Xintao Wang (132 papers)
  4. Jinwei Gu (62 papers)
  5. Chen Change Loy (288 papers)
Citations (17)

Summary

  • The paper introduces a novel method that leverages a pre-trained GAN as a latent bank to achieve rapid, high-fidelity image super-resolution.
  • It utilizes an encoder-bank-decoder framework to extract multi-resolution features, significantly cutting runtime compared to traditional GAN inversion techniques.
  • Experiments show that GLEAN compares favorably with existing methods, and that its streamlined LightGLEAN variant achieves comparable image quality with only 21% of the parameters and 35% of the FLOPs.

Overview of GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond

The paper, "GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond," uses pre-trained Generative Adversarial Networks (GANs) such as StyleGAN and BigGAN as a latent bank to improve image super-resolution. Most perceptual-oriented techniques try to produce realistic outputs by training with an adversarial loss. GLEAN instead draws directly on the rich priors encapsulated in a pre-trained GAN, and it does so without image-specific optimization at runtime.

Methodology

GLEAN places a pre-trained GAN inside an encoder-bank-decoder architecture, so restoration needs only a single forward pass. This makes it far cheaper at runtime than GAN inversion methods, which require image-specific optimization. The encoder extracts latent vectors and multi-resolution features from the low-resolution input; together these condition the latent bank (the pre-trained generator), which supplies textures and details. The decoder then fuses the encoder and bank features, linked by multi-resolution skip connections, into the output image. Swapping in different generative models lets the same design handle categories such as human faces, cats, buildings, and cars.
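The data flow described above can be traced at the level of feature-map resolutions. The following is a minimal sketch, not the authors' implementation: the function names, the halving/doubling schedule, the 4x4 base resolution, and the 32-to-256 example sizes are all assumptions chosen for illustration, and no real network weights are involved.

```python
# Shape-level sketch of an encoder-bank-decoder pass.
# Assumptions (not from the paper summary): sizes are powers of two,
# the encoder halves resolution per level, and the latent bank is a
# StyleGAN-like generator that doubles resolution per level from 4x4.

def encoder_resolutions(lr_size, min_size=4):
    """Encoder feature resolutions, from the input size down to min_size."""
    sizes = []
    size = lr_size
    while size >= min_size:
        sizes.append(size)
        size //= 2
    return sizes


def bank_resolutions(hr_size, min_size=4):
    """Latent-bank (generator) feature resolutions, from min_size up to the output size."""
    sizes = []
    size = min_size
    while size <= hr_size:
        sizes.append(size)
        size *= 2
    return sizes


def trace_forward(lr_size, hr_size):
    """List which resolutions are involved at each stage of one forward pass."""
    enc = encoder_resolutions(lr_size)
    bank = bank_resolutions(hr_size)
    # Skip connections: bank levels that have an encoder feature of matching size.
    skips = [s for s in bank if s in enc]
    # Decoder levels: from the input resolution up to the output resolution,
    # each one fusing the bank feature of the same size.
    decoder = [s for s in bank if s >= lr_size]
    return {"encoder": enc, "bank": bank, "skips": skips, "decoder": decoder}


if __name__ == "__main__":
    # Hypothetical 8x upscaling example: 32x32 input, 256x256 output.
    for stage, sizes in trace_forward(32, 256).items():
        print(stage, sizes)
```

The sketch only shows which resolutions meet where; the point is that everything happens in one pass, with no per-image optimization loop.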

The paper also presents LightGLEAN, a lightweight version that retains only the critical components of GLEAN. It achieves comparable image quality with only 21% of the parameters and 35% of the FLOPs.
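To put the 21% and 35% figures in more familiar terms, they can be converted to reduction factors. This is simple arithmetic on the two reported ratios; the helper name is an assumption, and no absolute parameter or FLOP counts are implied.

```python
# Convert LightGLEAN's reported cost ratios into "times fewer" factors.
PARAM_RATIO = 0.21  # LightGLEAN parameters as a fraction of GLEAN's (from the paper)
FLOP_RATIO = 0.35   # LightGLEAN FLOPs as a fraction of GLEAN's (from the paper)


def reduction_factor(ratio):
    """Return how many times smaller a cost is, given its ratio to the baseline."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return 1.0 / ratio


param_reduction = reduction_factor(PARAM_RATIO)
flop_reduction = reduction_factor(FLOP_RATIO)

print(f"parameters: about {param_reduction:.1f}x fewer")
print(f"FLOPs: about {flop_reduction:.1f}x fewer")
```

The parameter count shrinks by roughly 4.8x and the compute by roughly 2.9x, so the saving in model size is larger than the saving in compute.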

Experiments and Results

The paper reports extensive experiments in which GLEAN compares favorably with existing methods across multiple tasks:

  • Super-Resolution: GLEAN reconstructs high-fidelity images with more realistic textures and structures than existing methods. On human faces, its outputs achieve high similarity to the ground truth in ArcFace features, indicating that identity is well preserved.
  • Colorization and Restoration Tasks: GLEAN also extends to image colorization and blind image restoration. Training with randomized degradations lets it produce favorable results on real-world degraded images.
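The identity-preservation measure mentioned above compares face embeddings of the restored image and the ground truth. A minimal sketch follows, assuming the comparison is cosine similarity (a common choice for face embeddings; the summary does not state the exact metric). The short vectors are toy placeholders, not actual ArcFace outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical embeddings standing in for a face-recognition model's outputs.
    ground_truth = [0.2, 0.9, -0.1, 0.4]
    restored = [0.25, 0.85, -0.05, 0.35]
    print(f"identity similarity: {cosine_similarity(ground_truth, restored):.3f}")
```

Values close to 1 mean the two embeddings point in nearly the same direction, which is read as the restored face keeping the same identity.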

Implications and Future Directions

The integration of GANs as a latent bank introduces a promising direction to improve image restoration efficiency and quality. The potential extension of GLEAN to other tasks opens new avenues for leveraging generative priors in diverse applications. The approach also emphasizes the importance of strong generative priors in overcoming ill-posed image restoration tasks.

Future developments may focus on enhancing the GAN architecture to support more extensive class diversity and handle complex degradations. Investigating larger generative models that encompass broader categories could further generalize the application of GLEAN. Additionally, refining the decoder design and augmentation techniques promises improvements in output quality, computational efficiency, and real-world applicability.

Overall, GLEAN contributes to image super-resolution and related fields by applying GAN-based priors to improve restoration quality while keeping computational cost practical. Its results suggest wider potential in other areas of computer vision, especially where natural image priors are critical to the task.