
Multiscale Augmented Normalizing Flows for Image Compression (2305.05451v3)

Published 9 May 2023 in eess.IV and cs.CV

Abstract: Most learning-based image compression methods lack efficiency for high image quality due to their non-invertible design. The decoding function of the frequently applied compressive autoencoder architecture is only an approximated inverse of the encoding transform. This issue can be resolved by using invertible latent variable models, which allow a perfect reconstruction if no quantization is performed. Furthermore, many traditional image and video coders apply dynamic block partitioning to vary the compression of certain image regions depending on their content. Inspired by this approach, hierarchical latent spaces have been applied to learning-based compression networks. In this paper, we present a novel concept, which adapts the hierarchical latent space for augmented normalizing flows, an invertible latent variable model. Our best performing model achieved average rate savings of more than 7% over comparable single-scale models.
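The invertibility claim in the abstract can be illustrated with a toy additive coupling step, the kind of building block that augmented normalizing flows are based on. The following is a minimal sketch and not the authors' architecture: the weight matrices `W_enc` and `W_dec`, the `tanh` nonlinearity, and the tensor shapes are hypothetical stand-ins for learned transforms. It shows that the inverse recovers the input up to floating-point error when the latent is left unquantized, and that rounding the latent introduces reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for learned analysis/synthesis transforms (assumption).
W_enc = rng.normal(size=(4, 4))
W_dec = rng.normal(size=(4, 4))


def forward(x, z):
    """One additive coupling step: update the latent from the input, then the input from the latent."""
    z = z + np.tanh(x @ W_enc)
    x = x - np.tanh(z @ W_dec)
    return x, z


def inverse(x, z):
    """Undo the two additive updates in reverse order."""
    x = x + np.tanh(z @ W_dec)
    z = z - np.tanh(x @ W_enc)
    return x, z


x0 = rng.normal(size=(2, 4))   # toy "image" features
z0 = np.zeros((2, 4))          # augmented latent, initialised to zero here

x1, z1 = forward(x0, z0)

# Without quantization, the inverse recovers the input up to rounding error.
x_rec, z_rec = inverse(x1, z1)
print("max error without quantization:", np.abs(x_rec - x0).max())

# Rounding the latent (a crude stand-in for quantization) breaks exact recovery.
x_q, _ = inverse(x1, np.round(z1))
print("max error with quantized latent:", np.abs(x_q - x0).max())
```

Because each update only adds a function of the other variable, the step is invertible by construction, whatever the inner transforms are. A compressive autoencoder has no such guarantee, which is the limitation the abstract points to.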


