Robustly overfitting latents for flexible neural image compression (2401.17789v3)

Published 31 Jan 2024 in cs.CV, cs.LG, and stat.ML

Abstract: Neural image compression has made a great deal of progress. State-of-the-art models are based on variational autoencoders and outperform classical models. Neural compression models learn to encode an image into a quantized latent representation that can be efficiently sent to the decoder, which decodes the quantized latent into a reconstructed image. While these models have proven successful in practice, they lead to sub-optimal results due to imperfect optimization and limitations in encoder and decoder capacity. Recent work shows how to use stochastic Gumbel annealing (SGA) to refine the latents of pre-trained neural image compression models. We extend this idea by introducing SGA+, which contains three different methods that build upon SGA. We show how our method improves the overall compression performance in terms of the R-D trade-off, compared to its predecessors. Additionally, we show how refinement of the latents with our best-performing method improves the compression performance on both the Tecnick and CLIC datasets. Our method is deployed for a pre-trained hyperprior and for a more flexible model. Further, we give a detailed analysis of our proposed methods and show that they are less sensitive to hyperparameter choices. Finally, we show how each method can be extended to three-class instead of two-class rounding.
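
To make "two-class rounding" concrete, the following is a minimal sketch of temperature-controlled stochastic rounding of one latent value to its floor or its ceiling. It assumes the atanh-based logits commonly associated with SGA; the abstract does not state this formula, and the three SGA+ variants are not reproduced here. The function names, the `eps` clamp, and the hard sampling step (in place of a differentiable Gumbel-softmax relaxation) are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def two_class_rounding_probs(v, tau, eps=1e-6):
    """Return (floor, ceil, p_floor, p_ceil) for latent value v at temperature tau.

    Assumption: logits are -atanh(distance) / tau, so the nearer integer
    gets the higher probability and a lower tau sharpens the choice.
    """
    lo = math.floor(v)
    hi = lo + 1
    # Distances to the two candidates, clamped so atanh stays finite.
    d_lo = min(max(v - lo, eps), 1.0 - eps)
    d_hi = min(max(hi - v, eps), 1.0 - eps)
    logit_lo = -math.atanh(d_lo) / tau
    logit_hi = -math.atanh(d_hi) / tau
    # Numerically stable two-way softmax.
    m = max(logit_lo, logit_hi)
    e_lo = math.exp(logit_lo - m)
    e_hi = math.exp(logit_hi - m)
    z = e_lo + e_hi
    return lo, hi, e_lo / z, e_hi / z


def sample_rounding(v, tau, rng):
    """Draw a hard rounding decision (floor or ceil) for v."""
    lo, hi, p_lo, _ = two_class_rounding_probs(v, tau)
    return lo if rng.random() < p_lo else hi


if __name__ == "__main__":
    rng = random.Random(0)
    for tau in (1.0, 0.5, 0.1):
        _, _, p_lo, p_hi = two_class_rounding_probs(0.2, tau)
        print(f"tau={tau}: p_floor={p_lo:.4f}, p_ceil={p_hi:.4f}")
    print("sample:", sample_rounding(0.2, 0.1, rng))
```

Annealing `tau` toward zero during refinement makes the rounding increasingly deterministic. A three-class extension would, under the same assumptions, add a third candidate integer and a third logit to the softmax.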
