
AICT: An Adaptive Image Compression Transformer (2307.06091v1)

Published 12 Jul 2023 in cs.CV and eess.IV

Abstract: Motivated by an efficiency investigation of the Transformer-based transform coding framework SwinT-ChARM, we first propose to enhance it with a more straightforward yet effective Transformer-based channel-wise auto-regressive prior model, resulting in an absolute image compression transformer (ICT). Current methods that still rely on ConvNet-based entropy coding are limited in modeling long-range dependencies due to their local connectivity and a growing number of architectural biases and priors. In contrast, the proposed ICT can capture both global and local contexts from the latent representations and better parameterize the distribution of the quantized latents. Further, we leverage a learnable scaling module with a sandwich ConvNeXt-based pre-/post-processor to accurately extract a more compact latent representation while reconstructing higher-quality images. Extensive experimental results on benchmark datasets show that the proposed adaptive image compression transformer (AICT) framework significantly improves the trade-off between coding efficiency and decoder complexity over the versatile video coding (VVC) reference encoder (VTM-18.0) and the neural codec SwinT-ChARM.
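For intuition, here is a minimal PyTorch sketch of what a Transformer-based channel-wise auto-regressive (ChARM) prior can look like. All names, the slice count, and the tensor shapes below are illustrative assumptions, not the paper's exact architecture: the latent channels are split into slices, and the Gaussian entropy parameters of each slice are predicted from the hyperprior features together with the already-decoded slices via global self-attention, rather than with masked convolutions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransformerChARMPrior(nn.Module):
    """Illustrative channel-wise auto-regressive prior with a
    Transformer context model (sizes and shapes are assumptions)."""

    def __init__(self, latent_ch=320, slices=5, dim=192, heads=8, depth=2):
        super().__init__()
        assert latent_ch % slices == 0
        self.slices, self.slice_ch = slices, latent_ch // slices
        # One 1x1 projection per slice: (hyperprior || decoded slices) -> tokens.
        self.embed = nn.ModuleList([
            nn.Conv2d(latent_ch + i * self.slice_ch, dim, 1)
            for i in range(slices)
        ])
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.context = nn.TransformerEncoder(layer, num_layers=depth)
        # Emits the mean and scale of a Gaussian entropy model per slice.
        self.params = nn.Conv2d(dim, 2 * self.slice_ch, 1)

    def forward(self, y_hat, hyper):
        # y_hat: quantized latent (B, latent_ch, H, W); hyper: same shape.
        means, scales, decoded = [], [], []
        for i in range(self.slices):
            ctx = torch.cat([hyper] + decoded, dim=1)
            t = self.embed[i](ctx)                           # (B, dim, H, W)
            b, d, h, w = t.shape
            t = self.context(t.flatten(2).transpose(1, 2))   # global attention
            t = t.transpose(1, 2).reshape(b, d, h, w)
            mu, sigma = self.params(t).chunk(2, dim=1)
            means.append(mu)
            scales.append(F.softplus(sigma))
            # At decode time this slice would come from the bitstream;
            # in this sketch we simply read it from y_hat.
            decoded.append(y_hat[:, i * self.slice_ch:(i + 1) * self.slice_ch])
        return torch.cat(means, 1), torch.cat(scales, 1)
```

The idea, per the abstract, is that attending over the whole latent grid relaxes the locality bias of ConvNet-based entropy coding, so both global and local dependencies of the quantized latents inform the predicted distribution.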
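The abstract's second ingredient, the learnable scaling module, can be sketched the same way: a small network predicts a resize factor so the codec can trade spatial resolution for rate before the analysis transform and invert the resize after synthesis. Again, the class, the factor range, and the bicubic resampling below are assumptions for illustration, not the paper's exact design (which additionally sandwiches the codec between ConvNeXt-based pre- and post-processors).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableScale(nn.Module):
    """Predicts one resize factor s in [s_min, 1] from the input image
    (hypothetical design; s would be signaled to the decoder)."""

    def __init__(self, s_min=0.5):
        super().__init__()
        self.s_min = s_min
        self.head = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.GELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

    def forward(self, x):
        s = self.s_min + (1.0 - self.s_min) * torch.sigmoid(self.head(x)).mean()
        # float(s) detaches the factor; a real training setup would use a
        # differentiable resize so gradients reach the predictor.
        x_down = F.interpolate(x, scale_factor=float(s), mode='bicubic',
                               align_corners=False)
        return x_down, s
```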

References (26)
  1. “Variable rate image compression with recurrent neural networks,” arXiv preprint arXiv:1511.06085, 2015.
  2. “Joint autoregressive and hierarchical priors for learned image compression,” Advances in neural information processing systems, vol. 31, 2018.
  3. “Extended end-to-end optimized image compression method based on a context-adaptive entropy model,” in CVPR Workshops, 2019.
  4. “Conditional probability models for deep image compression,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4394–4402.
  5. “Variational image compression with a scale hyperprior,” arXiv preprint arXiv:1802.01436, 2018.
  6. “Coarse-to-fine hyper-prior modeling for learned image compression,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2020, vol. 34, pp. 11013–11020.
  7. “Image-dependent local entropy models for learned image compression,” in 2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, 2018, pp. 430–434.
  8. “Learned image compression with discretized gaussian mixture likelihoods and attention modules,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 7939–7948.
  9. “Learning context-based nonlocal entropy modeling for image compression,” IEEE Transactions on Neural Networks and Learning Systems, 2021.
  10. “Learning accurate entropy model with global reference for image compression,” arXiv preprint arXiv:2010.08321, 2020.
  11. “End-to-end learnt image compression via non-local attention optimization and improved context modeling,” IEEE Transactions on Image Processing, vol. 30, pp. 3179–3191, 2021.
  12. “High-fidelity generative image compression,” US Patent App. 17/107,684, June 2, 2022.
  13. “Channel-wise autoregressive entropy models for learned image compression,” in 2020 IEEE International Conference on Image Processing (ICIP). IEEE, 2020, pp. 3339–3343.
  14. “Transformer-based transform coding,” in International Conference on Learning Representations, 2021.
  15. “The devil is in the details: Window-based attention for image compression,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 17492–17501.
  16. “Contextformer: A transformer with spatio-channel attention for context modeling in learned image compression,” in Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XIX. Springer, 2022, pp. 447–463.
  17. “Unified multivariate gaussian mixture for efficient neural image compression,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 17612–17621.
  18. “Joint global and local hierarchical priors for learned image compression,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5992–6001.
  19. “Elic: Efficient learned image compression with unevenly grouped space-channel contextual adaptive coding,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5718–5727.
  20. “Spatial transformer networks,” Advances in neural information processing systems, vol. 28, 2015.
  21. “Estimating the resize parameter in end-to-end learned image compression,” arXiv preprint arXiv:2204.12022, 2022.
  22. “Learning to resize images for computer vision tasks,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 497–506.
  23. “Learning to downsample for segmentation of ultra-high resolution images,” arXiv preprint arXiv:2109.11071, 2021.
  24. “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 10012–10022.
  25. “A convnet for the 2020s,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 11976–11986.
  26. Benchmark datasets: Kodak testing set (http://r0k.us/graphics), Tecnick testing set (https://testimages.org/), JPEG-AI testing set (https://jpegai.github.io/test_images/), and CLIC21 testing set (http://compression.cc/tasks/).
Citations (3)