
MambaJSCC: Deep Joint Source-Channel Coding with Visual State Space Model (2405.03125v1)

Published 6 May 2024 in cs.IT and math.IT

Abstract: Lightweight and efficient deep joint source-channel coding (JSCC) is a key technology for semantic communications. In this paper, we design a novel JSCC scheme named MambaJSCC, which uses a visual state space model with channel adaptation (VSSM-CA) block as its backbone for transmitting images over wireless channels. The VSSM-CA block uses the VSSM to integrate two-dimensional images with the state space, enabling the feature extraction and encoding processes to operate with linear complexity. It also incorporates channel state information (CSI) via a newly proposed CSI embedding method, which deploys a shared CSI encoding module within both the encoder and decoder to encode and inject the CSI into each VSSM-CA block, improving the adaptability of a single model to varying channel conditions. Experimental results show that MambaJSCC not only outperforms Swin Transformer based JSCC (SwinJSCC) but also significantly reduces parameter count, computational overhead, and inference delay. For example, when employing an equal number of VSSM-CA blocks and Swin Transformer blocks, MambaJSCC achieves a 0.48 dB gain in peak signal-to-noise ratio (PSNR) over SwinJSCC while requiring only 53.3% of the multiply-accumulate operations, 53.8% of the parameters, and 44.9% of the inference delay.
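
The CSI embedding mechanism described in the abstract can be pictured concretely. Below is a minimal PyTorch sketch, assuming an MLP-based shared CSI encoder and a simple additive injection into each block; the class names, layer choices, and the stand-in mixing layer (used in place of the paper's 2-D selective state space scan) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class CSIEncoder(nn.Module):
    """Hypothetical shared CSI encoding module: maps a scalar channel
    SNR to an embedding that is injected into every VSSM-CA block."""
    def __init__(self, dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, dim),
            nn.SiLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, snr_db: torch.Tensor) -> torch.Tensor:
        # snr_db: (batch, 1) channel state information in dB
        return self.mlp(snr_db)

class VSSMCABlock(nn.Module):
    """Placeholder VSSM-CA block: a normalization + linear mixing layer
    stands in for the 2-D selective scan, followed by additive CSI
    injection and a residual connection."""
    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mix = nn.Linear(dim, dim)  # stand-in for the VSSM scan

    def forward(self, x: torch.Tensor, csi_emb: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); csi_emb: (batch, dim)
        h = self.mix(self.norm(x)) + csi_emb.unsqueeze(1)  # inject CSI
        return x + h  # residual connection

# Usage: one shared CSI encoder feeds all blocks.
enc = CSIEncoder(dim=96)
blocks = nn.ModuleList(VSSMCABlock(96) for _ in range(4))
x = torch.randn(2, 64, 96)                # (batch, tokens, dim)
csi = enc(torch.full((2, 1), 10.0))       # embed a 10 dB SNR
for blk in blocks:
    x = blk(x, csi)
```

Because the CSI encoder is shared and its output conditions every block, a single trained model can adapt to different channel conditions by changing only the scalar SNR input, which is the adaptability property the abstract claims.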

References (13)
  1. D. Gündüz, Z. Qin, I. E. Aguerri, H. S. Dhillon, Z. Yang, A. Yener, K. K. Wong, and C.-B. Chae, “Beyond Transmitting Bits: Context, Semantics, and Task-Oriented Communications,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 5–41, 2023.
  2. Z. Qin, F. Gao, B. Lin, X. Tao, G. Liu, and C. Pan, “A Generalized Semantic Communication System: From Sources to Channels,” IEEE Wireless Communications, vol. 30, no. 3, pp. 18–26, 2023.
  3. J. Liu, S. Shao, W. Zhang, and H. V. Poor, “An Indirect Rate-Distortion Characterization for Semantic Sources: General Model and the Case of Gaussian Observation,” IEEE Transactions on Communications, vol. 70, no. 9, pp. 5946–5959, 2022.
  4. J. Xu, B. Ai, N. Wang, and W. Chen, “Deep Joint Source-Channel Coding for CSI Feedback: An End-to-End Approach,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 260–273, 2023.
  5. T. Wu, Z. Chen, D. He, L. Qian, Y. Xu, M. Tao, and W. Zhang, “CDDM: Channel Denoising Diffusion Models for Wireless Semantic Communications,” IEEE Transactions on Wireless Communications, pp. 1–1, 2024.
  6. S. Yao, K. Niu, S. Wang, and J. Dai, “Semantic Coding for Text Transmission: An Iterative Design,” IEEE Transactions on Cognitive Communications and Networking, vol. 8, no. 4, pp. 1594–1603, 2022.
  7. S. Wang, J. Dai, Z. Liang, K. Niu, Z. Si, C. Dong, X. Qin, and P. Zhang, “Wireless Deep Video Semantic Transmission,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 214–229, 2023.
  8. E. Bourtsoulatze, D. Burth Kurka, and D. Gündüz, “Deep Joint Source-Channel Coding for Wireless Image Transmission,” IEEE Transactions on Cognitive Communications and Networking, vol. 5, no. 3, pp. 567–579, 2019.
  9. K. Yang, S. Wang, J. Dai, X. Qin, K. Niu, and P. Zhang, “SwinJSCC: Taming Swin Transformer for Deep Joint Source-Channel Coding,” arXiv preprint arXiv:2308.09361, 2023.
  10. Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin Transformer: Hierarchical Vision Transformer using Shifted Windows,” in Proc. IEEE/CVF ICCV, 2021, pp. 9992–10002.
  11. J. Xu, B. Ai, W. Chen, A. Yang, P. Sun, and M. Rodrigues, “Wireless Image Transmission Using Deep Source Channel Coding With Attention Modules,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 4, pp. 2315–2328, 2022.
  12. Y. Liu, Y. Tian, Y. Zhao, H. Yu, L. Xie, Y. Wang, Q. Ye, and Y. Liu, “VMamba: Visual State Space Model,” arXiv preprint arXiv:2401.10166, 2024.
  13. A. Gu and T. Dao, “Mamba: Linear-Time Sequence Modeling with Selective State Spaces,” arXiv preprint arXiv:2312.00752, 2023.