Soft then Hard: Rethinking the Quantization in Neural Image Compression (2104.05168v4)
Abstract: Quantization is one of the core components of lossy image compression. For neural image compression, end-to-end optimization requires a differentiable approximation of quantization, and existing approximations generally fall into three categories: additive uniform noise, the straight-through estimator, and soft-to-hard annealing. Training with additive uniform noise approximates the quantization error variationally but suffers from a train-test mismatch. The other two methods avoid this mismatch but, as shown in this paper, hurt rate-distortion performance because they weaken the representation ability of the latents. We therefore propose a novel soft-then-hard quantization strategy for neural image compression that first learns an expressive latent space softly, then closes the train-test mismatch with hard quantization. In addition, going beyond fixed integer quantization, we apply scaled additive uniform noise to adaptively control the quantization granularity by deriving a new variational upper bound on the actual rate. Experiments demonstrate that our proposed methods are easy to adopt, stable to train, and highly effective, especially on complex compression models.
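The abstract names three differentiable stand-ins for hard rounding. The PyTorch sketch below illustrates all three; the function names and the tanh-based soft-rounding form are illustrative choices (the original soft-to-hard work instead softmax-weights codebook centers), not taken from this paper.

```python
import math
import torch

def noise_quantize(y: torch.Tensor) -> torch.Tensor:
    # Additive uniform noise: y + u with u ~ U(-0.5, 0.5). Fully
    # differentiable, but test time uses hard rounding instead, which
    # is the train-test mismatch the abstract refers to.
    return y + (torch.rand_like(y) - 0.5)

def ste_quantize(y: torch.Tensor) -> torch.Tensor:
    # Straight-through estimator: hard round() in the forward pass,
    # identity gradient in the backward pass.
    return y + (torch.round(y) - y).detach()

def soft_round(y: torch.Tensor, alpha: float) -> torch.Tensor:
    # Soft-to-hard annealing via a differentiable soft rounding:
    # small alpha is near the identity, large alpha approaches hard
    # rounding, so training anneals alpha upward over time.
    m = torch.floor(y) + 0.5
    r = y - m  # fractional offset in [-0.5, 0.5)
    return m + 0.5 * torch.tanh(alpha * r) / math.tanh(alpha / 2.0)
```

The first surrogate is smooth everywhere but mismatched at test time; the latter two match test-time rounding (exactly, or in the large-alpha limit) at the cost of the weakened latent expressiveness the abstract describes.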
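The paper's two proposals can likewise be sketched. The two-stage schedule and the learned step size `delta` below are a hypothetical reading of the abstract, not the paper's actual training recipe: `model`, `loader`, `opt`, `lam`, and `model.bits` are placeholders, and the sketch reuses `noise_quantize` and `ste_quantize` from above.

```python
import torch

def scaled_noise_quantize(y: torch.Tensor, delta: torch.Tensor,
                          training: bool) -> torch.Tensor:
    # Scaled additive uniform noise: u ~ U(-delta/2, delta/2), where a
    # learned (e.g., per-channel) delta adapts the quantization
    # granularity; the hard counterpart is delta * round(y / delta).
    if training:
        return y + delta * (torch.rand_like(y) - 0.5)
    return delta * torch.round(y / delta)

def train_soft_then_hard(model, loader, opt, lam, epochs, soft_epochs):
    # Hypothetical two-stage loop: soft (noisy) training first learns
    # expressive latents; hard quantization afterwards closes the
    # train-test gap (here approximated via the straight-through
    # estimator; the paper may fine-tune differently).
    for epoch in range(epochs):
        hard = epoch >= soft_epochs
        for x in loader:
            y = model.encoder(x)
            y_hat = ste_quantize(y) if hard else noise_quantize(y)
            x_hat = model.decoder(y_hat)
            rate = model.bits(y_hat)  # placeholder: -log2 p(y_hat)
            dist = torch.mean((x - x_hat) ** 2)
            loss = rate + lam * dist  # rate-distortion Lagrangian
            opt.zero_grad()
            loss.backward()
            opt.step()
```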
Authors: Zongyu Guo, Zhizheng Zhang, Runsen Feng, Zhibo Chen