
Semantic-Preserving Image Coding based on Conditional Diffusion Models (2310.15737v2)

Published 24 Oct 2023 in cs.IT and math.IT

Abstract: Semantic communication focuses on the meaning and the goal of the communication itself, rather than on bit-by-bit recovery of the transmitted messages. In this paper, we propose a novel semantic image coding scheme that preserves the semantic content of an image while ensuring a good trade-off between coding rate and image quality. In the proposed scheme, Semantic-Preserving Image Coding based on Conditional Diffusion Models (SPIC), the transmitter encodes a Semantic Segmentation Map (SSM) and a low-resolution version of the image to be transmitted. The receiver then reconstructs a high-resolution image using a Denoising Diffusion Probabilistic Model (DDPM) doubly conditioned on the SSM and the low-resolution image. As shown by the numerical examples, the proposed SPIC achieves a better balance than state-of-the-art (SOTA) approaches between the conventional rate-distortion trade-off and the preservation of semantically relevant features. Code available at https://github.com/frapez1/SPIC
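
The transmitter side described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`downsample`, `encode`, `bits_per_pixel`, `decode_inputs`), the block-average downsampling, the default factor of 8, and the use of `zlib` as a stand-in for the actual image codecs are all assumptions made for illustration. The sketch only shows what is sent (the SSM plus a coarse image) and how the resulting coding rate in bits per pixel could be measured; the DDPM-based reconstruction at the receiver is not modeled.

```python
import zlib

import numpy as np


def downsample(image, factor):
    """Block-average an (H, W, C) array by an integer factor (assumed scheme)."""
    h, w, c = image.shape
    if h % factor or w % factor:
        raise ValueError("image height and width must be divisible by factor")
    blocks = image.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))


def encode(image, ssm, factor=8):
    """Build the two-part payload: compressed SSM and compressed coarse image.

    zlib is only a placeholder for a real lossless/lossy image codec.
    """
    coarse = downsample(image.astype(np.float64), factor)
    coarse = np.clip(np.rint(coarse), 0, 255).astype(np.uint8)
    return {
        "ssm": zlib.compress(ssm.astype(np.uint8).tobytes(), 9),
        "coarse": zlib.compress(coarse.tobytes(), 9),
        "shape": image.shape[:2],
        "channels": image.shape[2],
        "factor": factor,
    }


def bits_per_pixel(payload):
    """Coding rate of the payload relative to the full-resolution image."""
    h, w = payload["shape"]
    total_bits = 8 * (len(payload["ssm"]) + len(payload["coarse"]))
    return total_bits / (h * w)


def decode_inputs(payload):
    """Recover the two conditioning signals a receiver-side model would use."""
    h, w = payload["shape"]
    f, c = payload["factor"], payload["channels"]
    ssm = np.frombuffer(zlib.decompress(payload["ssm"]), dtype=np.uint8)
    coarse = np.frombuffer(zlib.decompress(payload["coarse"]), dtype=np.uint8)
    return ssm.reshape(h, w), coarse.reshape(h // f, w // f, c)
```

In this sketch the SSM is coded losslessly, so the class labels reach the receiver unchanged, while the image content is carried only by the coarse version. The receiver would pass both outputs of `decode_inputs` to a conditional generative model as conditioning signals.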
