Generative Adversarial Networks for Extreme Learned Image Compression (1804.02958v3)

Published 9 Apr 2018 in cs.CV and cs.LG

Abstract: We present a learned image compression system based on GANs, operating at extremely low bitrates. Our proposed framework combines an encoder, decoder/generator and a multi-scale discriminator, which we train jointly for a generative learned compression objective. The model synthesizes details it cannot afford to store, obtaining visually pleasing results at bitrates where previous methods fail and show strong artifacts. Furthermore, if a semantic label map of the original image is available, our method can fully synthesize unimportant regions in the decoded image such as streets and trees from the label map, proportionally reducing the storage cost. A user study confirms that for low bitrates, our approach is preferred to state-of-the-art methods, even when they use more than double the bits.

The paper presents an image compression framework that uses Generative Adversarial Networks (GANs) to produce visually pleasing reconstructions at extremely low bitrates, below 0.1 bits per pixel (bpp). The system combines an encoder, a decoder/generator, and a multi-scale discriminator, trained jointly for a generative compression objective. The model synthesizes details it cannot afford to store, and a user study finds its reconstructions preferred over those of state-of-the-art codecs at these bitrates.
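To make the 0.1 bpp figure concrete, the following minimal sketch converts between bits per pixel and file size. The function names are illustrative, and the 768x512 resolution is used only as an example (it is the size of the Kodak test images).

```python
def compressed_size_bytes(width: int, height: int, bpp: float) -> float:
    """Size in bytes of an image stored at `bpp` bits per pixel."""
    return width * height * bpp / 8


def bits_per_pixel(num_bytes: int, width: int, height: int) -> float:
    """Bitrate in bpp of a compressed file that occupies `num_bytes` bytes."""
    return num_bytes * 8 / (width * height)


# A 768x512 image stored at the 0.1 bpp threshold mentioned above.
size = compressed_size_bytes(768, 512, 0.1)
print(f"{size:.1f} bytes")
```

At this bitrate the whole image fits in roughly 4.9 kB, whereas the same image as uncompressed 24-bit RGB takes 768 * 512 * 3 = 1,179,648 bytes.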

Key Contributions

  1. Framework Development: The researchers propose a principled GAN framework for image compression at very low bitrates. The framework uses adversarial losses to improve the visual quality of reconstructed images, capturing both global semantics and local textures.
  2. Generative and Selective Compression: The system operates in two distinct modes:
    • Generative Compression (GC): This mode focuses on preserving overall image content while generating the texture and detail that cannot be stored.
    • Selective Generative Compression (SC): Here, the system generates specific image regions from semantic label maps, significantly reducing storage requirements while preserving important regions in high detail.
  3. User Study Validation: The paper includes a user study comparing the proposed method against existing codecs such as BPG and a baseline autoencoder-based system. Results indicate that users prefer the GAN-based reconstructions at similar or lower bitrates.
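The summary does not say how the multi-scale discriminator is built. A minimal sketch, assuming "multi-scale" means that the same image is judged at several resolutions (a common design for multi-scale GAN discriminators), is to build a pyramid of progressively downsampled copies. The helper names and the 2x2 average pooling are assumptions for illustration, and the discriminators themselves are not implemented here.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values


def downsample_2x(image: Image) -> Image:
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def image_pyramid(image: Image, num_scales: int) -> List[Image]:
    """Return the image at `num_scales` resolutions, finest first.

    Assumption: each scale would be scored by its own discriminator,
    so coarse scales judge global structure and fine scales judge texture.
    """
    scales = [image]
    for _ in range(num_scales - 1):
        scales.append(downsample_2x(scales[-1]))
    return scales


# Hypothetical 4x4 input with pixel values 0..15.
example = [[float(4 * r + c) for c in range(4)] for r in range(4)]
for level in image_pyramid(example, 3):
    print(len(level), "x", len(level[0]))
```

For the 4x4 example this yields 4x4, 2x2, and 1x1 versions of the input.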

Experimental Results

The paper evaluates the method on the Kodak, RAISE1K, and Cityscapes datasets. For Kodak and RAISE1K, the GC models demonstrated significant bitrate savings: BPG needed between 95% and 181% more bits to achieve comparable visual quality in the user study. For Cityscapes, the GC models with semantic information performed even better.
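The "95% to 181% more bits" figure can be read as a multiplier of 1.95x to 2.81x on the GC bitrate. The short sketch below works through that arithmetic. The 0.05 bpp input is a hypothetical value chosen for illustration, not a number reported in the paper.

```python
from typing import Tuple


def bpg_bitrate_range(gc_bpp: float,
                      low_pct: float = 95.0,
                      high_pct: float = 181.0) -> Tuple[float, float]:
    """Bitrate range BPG would need if it uses `low_pct` to `high_pct`
    percent more bits than a GC model operating at `gc_bpp`."""
    return (gc_bpp * (1 + low_pct / 100), gc_bpp * (1 + high_pct / 100))


# Hypothetical GC operating point of 0.05 bpp (illustrative only).
low, high = bpg_bitrate_range(0.05)
print(f"BPG would need {low:.4f} to {high:.4f} bpp")
```

Under that assumed operating point, BPG would need about 0.0975 to 0.1405 bpp for comparable perceived quality.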

In addition, the SC method reduced bitrates further by synthesizing selected regions from semantic label maps instead of storing them, while maintaining overall scene fidelity.
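The idea behind selective compression can be sketched as a mask over the encoded features: entries in regions that will be synthesized from the label map are dropped, and the storage cost shrinks roughly in proportion to the preserved area, as the abstract describes. The zero-masking, the function names, and the flat 2D feature grid below are simplifying assumptions, not the paper's implementation.

```python
from typing import List

Grid = List[List[float]]
Mask = List[List[bool]]  # True = region is stored, False = synthesized


def mask_features(features: Grid, keep: Mask) -> Grid:
    """Zero out feature entries in regions that will be synthesized."""
    return [
        [value if kept else 0.0 for value, kept in zip(f_row, k_row)]
        for f_row, k_row in zip(features, keep)
    ]


def preserved_fraction(keep: Mask) -> float:
    """Fraction of positions whose features are actually stored."""
    total = sum(len(row) for row in keep)
    kept = sum(1 for row in keep for flag in row if flag)
    return kept / total


def estimated_bits(full_bits: float, keep: Mask) -> float:
    """Storage estimate, assuming cost scales with the preserved area."""
    return full_bits * preserved_fraction(keep)


# Hypothetical 2x2 feature grid; the top-right cell is synthesized.
features = [[0.7, -1.2], [0.3, 2.0]]
keep = [[True, False], [True, True]]
print(mask_features(features, keep))
print(estimated_bits(1000.0, keep))
```

In this toy case three of the four positions are kept, so the estimate is three quarters of the full cost.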

Implications and Future Directions

This work advances learned image compression and shows that GANs can be applied to compression, where reconstructions are judged by perceived visual quality. The system could have significant impact where visual quality at constrained bitrates is paramount, such as bandwidth-limited settings or applications like video calls.

Future developments could explore integrating semantic awareness for better bit allocation, optimizing encoding strategies, and improving GAN training stability to further enhance texture synthesis. Potential improvements in GAN architectures could lead to even more realistic image generation, widening the application scope of such methods in compressing high-detail visuals efficiently.

Conclusion

The research presents a compelling case for the application of GANs in image compression, delivering notable performance gains in preserving image quality at extremely low bitrates. This innovative approach offers a new perspective on balancing lossy compression with perceptual quality, providing strong foundational work for future explorations into advanced image synthesis and compression techniques.

Authors (5)
  1. Eirikur Agustsson (27 papers)
  2. Michael Tschannen (49 papers)
  3. Fabian Mentzer (19 papers)
  4. Radu Timofte (299 papers)
  5. Luc Van Gool (569 papers)
Citations (523)