Large Generative Model Assisted 3D Semantic Communication (2403.05783v1)

Published 9 Mar 2024 in cs.IT, cs.LG, and math.IT

Abstract: Semantic Communication (SC) is a novel paradigm for data transmission in 6G. However, performing SC in 3D scenarios poses several challenges: 1) 3D semantic extraction; 2) latent semantic redundancy; and 3) uncertain channel estimation. To address these issues, we propose a Generative AI Model assisted 3D SC (GAM-3DSC) system. First, we introduce a 3D Semantic Extractor (3DSE), which employs generative AI models, including the Segment Anything Model (SAM) and Neural Radiance Fields (NeRF), to extract key semantics from a 3D scenario based on user requirements. The extracted 3D semantics are represented as multi-perspective images of the goal-oriented 3D object. Then, we present an Adaptive Semantic Compression Model (ASCM) for encoding these multi-perspective images, in which a semantic encoder with two output heads performs semantic encoding and masks redundant semantics in the latent semantic space, respectively. Next, we design a conditional Generative adversarial network and Diffusion model aided Channel Estimation (GDCE) scheme to estimate and refine the Channel State Information (CSI) of physical channels. Finally, simulation results demonstrate the advantages of the proposed GAM-3DSC system in effectively transmitting the goal-oriented 3D scenario.
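
As a rough illustration of the two-head encoder idea behind the ASCM, the sketch below (PyTorch) shows one output head producing latent semantics while a second head predicts a mask that suppresses redundant latent dimensions before transmission. This is a minimal sketch, not the authors' implementation: the backbone, layer sizes, input resolution, and the soft sigmoid gating are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class TwoHeadSemanticEncoder(nn.Module):
    """Illustrative two-head semantic encoder (hypothetical architecture):
    one head emits latent semantics, the other emits a soft mask that
    down-weights redundant latent dimensions."""

    def __init__(self, in_dim: int = 3 * 64 * 64, latent_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_dim, 512),
            nn.ReLU(),
        )
        self.semantic_head = nn.Linear(512, latent_dim)  # semantic encoding
        self.mask_head = nn.Sequential(                  # redundancy mask
            nn.Linear(512, latent_dim),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.backbone(x)
        z = self.semantic_head(h)   # latent semantics
        m = self.mask_head(h)       # values near 0 flag redundant dimensions
        return z * m                # masked latent to be sent over the channel


if __name__ == "__main__":
    # Toy batch: 4 multi-perspective 64x64 RGB views of a 3D object.
    views = torch.randn(4, 3, 64, 64)
    encoder = TwoHeadSemanticEncoder()
    masked_latent = encoder(views)
    print(masked_latent.shape)  # torch.Size([4, 128])
```

The sigmoid gating here is only a stand-in for whatever masking mechanism the paper actually trains; the point of the sketch is that semantic encoding and redundancy masking can share a backbone and differ only in their output heads.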

Authors (7)
  1. Feibo Jiang (24 papers)
  2. Yubo Peng (15 papers)
  3. Li Dong (154 papers)
  4. Kezhi Wang (106 papers)
  5. Kun Yang (227 papers)
  6. Cunhua Pan (210 papers)
  7. Xiaohu You (177 papers)
Citations (4)
