Bit Distribution Study and Implementation of Spatial Quality Map in the JPEG-AI Standardization (2402.17470v1)

Published 27 Feb 2024 in cs.CV, cs.LG, and eess.IV

Abstract: Currently, there is high demand for neural network-based image compression codecs. These codecs employ non-linear transforms to create compact bit representations and enable faster coding on devices than the hand-crafted transforms used in classical frameworks. The scientific and industrial communities are highly interested in these properties, leading to the JPEG-AI standardization effort. The JPEG-AI verification model has been released and is currently under development for standardization. Using neural networks, it can outperform the classical codec VVC intra by more than 10% BD-rate at the base operation point. Researchers attribute this success to the flexible bit distribution in the spatial domain, in contrast to the VVC intra anchor, which is generated with a constant quality point. However, our study reveals that VVC intra achieves a more adaptable bit distribution structure through its use of variable block sizes. Based on these observations, we propose a spatial bit allocation method that optimizes the bit distribution of the JPEG-AI verification model and enhances visual quality. Furthermore, applying the VVC bit distribution strategy further improves the objective performance of the JPEG-AI verification model, yielding a maximum gain of 0.45 dB in PSNR-Y.
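The two figures of merit quoted in the abstract, BD-rate and PSNR-Y, are standard objective measures in codec comparisons. As a minimal, illustrative sketch (not taken from the paper's code), the snippet below computes PSNR on the luma channel and the Bjontegaard delta rate between two rate-distortion curves via cubic interpolation of log-rate over the common PSNR range; the function names and the sample rate-distortion points are hypothetical.

    import numpy as np

    def psnr_y(ref_y, rec_y, peak=255.0):
        """PSNR on the luma (Y) channel, in dB."""
        mse = np.mean((ref_y.astype(np.float64) - rec_y.astype(np.float64)) ** 2)
        return 10.0 * np.log10(peak ** 2 / mse)

    def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
        """Bjontegaard delta rate (%): average bitrate difference between two
        R-D curves, obtained by fitting a cubic polynomial to log-rate vs. PSNR
        for each curve and integrating over their overlapping PSNR range.
        Negative values mean the test codec needs less bitrate than the anchor
        for the same quality."""
        log_ra, log_rt = np.log(rate_anchor), np.log(rate_test)
        fit_a = np.polyfit(psnr_anchor, log_ra, 3)
        fit_t = np.polyfit(psnr_test, log_rt, 3)
        lo = max(np.min(psnr_anchor), np.min(psnr_test))
        hi = min(np.max(psnr_anchor), np.max(psnr_test))
        int_a = np.polyval(np.polyint(fit_a), hi) - np.polyval(np.polyint(fit_a), lo)
        int_t = np.polyval(np.polyint(fit_t), hi) - np.polyval(np.polyint(fit_t), lo)
        return (np.exp((int_t - int_a) / (hi - lo)) - 1.0) * 100.0

    # Hypothetical rate-distortion points (bpp, PSNR-Y in dB) for an anchor
    # (e.g. VVC intra) and a test codec (e.g. the JPEG-AI verification model).
    rate_anchor = [0.12, 0.25, 0.50, 1.00]
    psnr_anchor = [30.1, 33.0, 36.2, 39.5]
    rate_test   = [0.11, 0.22, 0.45, 0.92]
    psnr_test   = [30.3, 33.2, 36.5, 39.8]

    print(f"BD-rate: {bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):.2f}%")

Read in these terms, the paper's headline results are a BD-rate saving of more than 10% over VVC intra at the base operation point, plus up to 0.45 dB PSNR-Y from the proposed spatial bit allocation.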

Authors (9)
  1. Panqi Jia (6 papers)
  2. Jue Mao (3 papers)
  3. Esin Koyuncu (3 papers)
  4. A. Burakhan Koyuncu (5 papers)
  5. Timofey Solovyev (4 papers)
  6. Alexander Karabutov (4 papers)
  7. Yin Zhao (14 papers)
  8. Elena Alshina (9 papers)
  9. Andre Kaup (11 papers)
Citations (1)