
Enhancing the Rate-Distortion-Perception Flexibility of Learned Image Codecs with Conditional Diffusion Decoders (2403.02887v1)

Published 5 Mar 2024 in cs.CV and eess.IV

Abstract: Learned image compression codecs have recently achieved impressive compression performance, surpassing the most efficient image coding architectures. However, most approaches are trained to minimize rate and distortion, which often leads to unsatisfactory visual results at low bitrates since perceptual metrics are not taken into account. In this paper, we show that conditional diffusion models can lead to promising results in the generative compression task when used as a decoder, and that, given a compressed representation, they enable new tradeoff points between distortion and perception at the decoder side, selected via the sampling method.
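The abstract's key idea is that a single compressed representation can be decoded along different points of the distortion-perception tradeoff by changing only the sampling procedure of a conditional diffusion decoder. A minimal sketch of this mechanism, assuming a DDIM-style sampler where the `eta` parameter interpolates between deterministic sampling (`eta=0`, lower distortion) and stochastic DDPM-like sampling (`eta=1`, more perceptual diversity); `toy_denoiser` is a hypothetical stand-in for the learned conditional noise predictor, not the paper's actual network:

```python
import numpy as np

def toy_denoiser(x_t, y, t):
    # Hypothetical stand-in for a learned conditional noise predictor
    # eps_theta(x_t, y, t): here it simply pulls x_t toward the
    # conditioning signal y (the compressed representation).
    return x_t - y

def ddim_sample(y, alphas_bar, eta=0.0, seed=0):
    """DDIM-style conditional sampling from pure noise, conditioned on y.

    alphas_bar is the cumulative noise schedule, decreasing with the step
    index. eta=0 gives a deterministic trajectory; eta>0 re-injects noise
    at each step, trading distortion for perceptual diversity.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(y.shape)  # start from pure Gaussian noise
    T = len(alphas_bar)
    for i in range(T - 1, 0, -1):
        ab_t, ab_prev = alphas_bar[i], alphas_bar[i - 1]
        eps = toy_denoiser(x, y, i)
        # Predict the clean image implied by the current noise estimate.
        x0_pred = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        # eta controls how much fresh noise is injected (DDIM eq. 16).
        sigma = eta * np.sqrt((1.0 - ab_prev) / (1.0 - ab_t)) \
                    * np.sqrt(1.0 - ab_t / ab_prev)
        dir_xt = np.sqrt(max(1.0 - ab_prev - sigma**2, 0.0)) * eps
        x = np.sqrt(ab_prev) * x0_pred + dir_xt \
            + sigma * rng.standard_normal(y.shape)
    return x

# Toy conditioning signal and a decreasing cumulative schedule.
y = np.full(4, 0.5)
alphas_bar = np.linspace(0.999, 0.01, 10)
x_deterministic = ddim_sample(y, alphas_bar, eta=0.0)  # distortion-oriented
x_stochastic = ddim_sample(y, alphas_bar, eta=1.0)     # perception-oriented
```

With `eta=0` the trajectory is fully determined by the conditioning and the initial noise, so repeated decodes are identical; with `eta=1` each decode is a distinct sample consistent with the same compressed representation, which is the flexibility the paper exploits.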

Authors (2)
  1. Daniele Mari
  2. Simone Milani