Transformations in Learned Image Compression from a Modulation Perspective (2203.02158v3)
Abstract: In this paper, a unified transformation method in learned image compression (LIC) is proposed from the perspective of modulation. First, the quantization in LIC is treated as a generalized channel with additive uniform noise. The LIC is then interpreted as a particular communication system, based on the consistency between the two in structure and optimization objectives. Thus, techniques from communication systems can be applied to guide the design of modules in LIC. Furthermore, a unified transform method based on signal modulation (TSM) is defined. From the viewpoint of TSM, the existing transformation methods are mathematically reduced to a linear modulation. A series of transformation methods, e.g. TPM and TJM, are obtained by extending to nonlinear modulation. The experimental results on various datasets and backbone architectures verify the effectiveness and robustness of the proposed method. More importantly, they further confirm the feasibility of guiding LIC design from a communication perspective. For example, when the backbone architecture is a hyperprior combined with a context model, our method achieves a 3.52% BD-rate reduction over GDN on the Kodak dataset without increasing complexity.
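The channel view of quantization mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the Gaussian latents, the sample count, and the helper names `hard_quantize` and `noisy_channel` are assumptions made for illustration. It compares the error introduced by integer rounding with additive uniform noise on [-0.5, 0.5), the noise model the abstract refers to.

```python
import random


def hard_quantize(latents):
    """Round every latent value to the nearest integer."""
    return [float(round(v)) for v in latents]


def noisy_channel(latents, rng):
    """Pass latents through a channel that adds uniform noise in [-0.5, 0.5)."""
    return [v + rng.uniform(-0.5, 0.5) for v in latents]


def variance(values):
    """Population variance of a list of floats."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


rng = random.Random(0)
# Hypothetical latents: zero-mean Gaussian with a spread much wider than the
# quantization step, so the rounding error is close to uniformly distributed.
latents = [rng.gauss(0.0, 4.0) for _ in range(10000)]

round_err = [q - v for q, v in zip(hard_quantize(latents), latents)]
noise_err = [n - v for n, v in zip(noisy_channel(latents, rng), latents)]

# Both error variances should sit near 1/12, the variance of a unit-width
# uniform distribution.
print("rounding error variance:", variance(round_err))
print("uniform noise variance: ", variance(noise_err))
print("reference 1/12:         ", 1.0 / 12.0)
```

In this sketch the two error sequences have nearly the same second-order statistics, which is the usual argument for modelling the quantizer as a noisy channel. It says nothing about how TSM, TPM, or TJM are defined.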