Soft then Hard: Rethinking the Quantization in Neural Image Compression (2104.05168v4)

Published 12 Apr 2021 in eess.IV

Abstract: Quantization is one of the core components in lossy image compression. For neural image compression, end-to-end optimization requires differentiable approximations of quantization, which generally fall into three categories: additive uniform noise, the straight-through estimator, and soft-to-hard annealing. Training with additive uniform noise approximates the quantization error variationally but suffers from a train-test mismatch. The other two methods avoid this mismatch but, as shown in this paper, hurt rate-distortion performance because they weaken the representation ability of the latent space. We thus propose a novel soft-then-hard quantization strategy for neural image compression that first learns an expressive latent space softly, then closes the train-test mismatch with hard quantization. In addition, beyond fixed integer quantization, we apply scaled additive uniform noise to adaptively control the quantization granularity by deriving a new variational upper bound on the actual rate. Experiments demonstrate that our proposed methods are easy to adopt, stable to train, and highly effective, especially on complex compression models.
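As a rough illustration of the quantization surrogates named in the abstract, the PyTorch sketch below implements additive uniform noise, the straight-through estimator, and hard rounding, together with a hypothetical two-stage switch from the soft (noise) surrogate to hard quantization. All names here (`quantize`, `delta`, `switch_step`) and the schedule itself are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed, not the paper's code) of differentiable
# quantization surrogates and a soft-then-hard training switch.
import torch


def quantize(y: torch.Tensor, mode: str = "noise", delta: float = 1.0) -> torch.Tensor:
    """Stand-ins for hard rounding with step size `delta`.

    "noise": additive uniform noise in [-delta/2, delta/2]
             (variational approximation of the quantization error).
    "ste":   straight-through estimator -- rounds in the forward pass,
             passes the gradient through unchanged in the backward pass.
    "hard":  plain rounding, as used by the actual codec at test time
             (no useful gradient on its own).
    """
    if mode == "noise":
        noise = (torch.rand_like(y) - 0.5) * delta  # U(-delta/2, delta/2)
        return y + noise
    if mode == "ste":
        return y + (torch.round(y / delta) * delta - y).detach()
    if mode == "hard":
        return torch.round(y / delta) * delta
    raise ValueError(f"unknown quantization mode: {mode}")


def surrogate_for_step(step: int, switch_step: int = 100_000) -> str:
    """Hypothetical soft-then-hard schedule: learn an expressive latent
    space with the soft (noise) surrogate first, then fine-tune with hard
    quantization (via the straight-through estimator) to close the
    train-test mismatch."""
    return "noise" if step < switch_step else "ste"
```

In a training loop one would call `quantize(y, surrogate_for_step(step))`, while inference uses `quantize(y, "hard")`; the `delta` argument hints at how a scaled (non-integer) step size could control quantization granularity, in the spirit of the scaled uniform noise described above.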

Authors (4)
  1. Zongyu Guo (19 papers)
  2. Zhizheng Zhang (60 papers)
  3. Runsen Feng (15 papers)
  4. Zhibo Chen (176 papers)
Citations (75)