
Region-Adaptive Transform with Segmentation Prior for Image Compression (2403.00628v4)

Published 1 Mar 2024 in cs.CV and eess.IV

Abstract: Learned Image Compression (LIC) has shown remarkable progress in recent years. Existing works commonly employ CNN-based or self-attention-based modules as transform methods for compression. However, there is no prior research on neural transforms that focus on specific regions. In response, we introduce class-agnostic segmentation masks (i.e., semantic masks without category labels) for extracting region-adaptive contextual information. Our proposed module, Region-Adaptive Transform, applies adaptive convolutions on different regions guided by the masks. Additionally, we introduce a plug-and-play module named Scale Affine Layer to incorporate rich contexts from various regions. While prior image compression efforts have used segmentation masks as additional intermediate inputs, our approach differs significantly from them. To avoid extra bitrate overhead, we treat these masks as privilege information, which is accessible during the model training stage but not required during the inference phase. To the best of our knowledge, we are the first to employ class-agnostic masks as privilege information and achieve superior performance in pixel-fidelity metrics, such as Peak Signal-to-Noise Ratio (PSNR). The experimental results demonstrate our improvement over previously well-performing methods, with about 8.2% bitrate saving compared to VTM-17.0. The source code is available at https://github.com/GityuxiLiu/SegPIC-for-Image-Compression.
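
To make the idea of mask-guided, region-adaptive filtering concrete, the following is a minimal NumPy sketch and not the paper's implementation. It assumes a feature map of shape (C, H, W), an integer mask of region ids, and a hypothetical projection matrix `weight` that turns each region's average feature into one per-channel 1x1 kernel. The function name `region_adaptive_scale` and all parameters are illustrative assumptions. The sketch covers only the case where a mask is available; the abstract states that masks are not required at inference.

```python
import numpy as np

def region_adaptive_scale(x, mask, weight):
    """Apply a different per-channel 1x1 kernel to each mask region.

    x:      (C, H, W) feature map
    mask:   (H, W) integer region ids (class-agnostic: ids carry no labels)
    weight: (C, C) hypothetical projection from pooled feature to kernel
    """
    out = np.empty_like(x)
    for region_id in np.unique(mask):
        sel = mask == region_id              # (H, W) boolean selector
        pooled = x[:, sel].mean(axis=1)      # (C,) masked average pooling
        kernel = weight @ pooled             # (C,) kernel for this region
        out[:, sel] = x[:, sel] * kernel[:, None]
    return out

# Toy example: two regions, two channels, identity projection (all assumed).
mask = np.array([[0, 0], [1, 1]])
x = np.where(mask == 0, 2.0, 3.0)[None].repeat(2, axis=0)
y = region_adaptive_scale(x, mask, np.eye(2))
```

With the identity projection, each region is scaled by its own mean, so pixels that share a region share a kernel while different regions are filtered differently.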
