
Rate Splitting Multiple Access-Enabled Adaptive Panoramic Video Semantic Transmission (2402.16581v2)

Published 26 Feb 2024 in eess.IV

Abstract: In this paper, we propose an adaptive panoramic video semantic transmission (APVST) framework enabled by rate splitting multiple access (RSMA). The APVST framework consists of a semantic transmitter and receiver and uses a deep joint source-channel coding structure to adaptively extract and encode semantic features from panoramic frames. To achieve higher spectral efficiency and conserve bandwidth, APVST employs an entropy model and a dimension-adaptive module to control the transmission rate. Additionally, we adopt weighted-to-spherically-uniform peak signal-to-noise ratio (WS-PSNR) and weighted-to-spherically-uniform structural similarity (WS-SSIM) as distortion evaluation metrics for panoramic videos and design a weighted self-attention module for APVST. This module integrates weights and feature maps to enhance the quality of the immersive experience. Because users' fields of view overlap when they watch panoramic videos, we further use RSMA to split the required panoramic video semantic streams into common and private messages for transmission. We propose an RSMA-enabled semantic stream transmission scheme and formulate a joint latency and immersive-experience-quality problem that optimizes the allocation ratios of power, common rate, and channel bandwidth to maximize users' quality of service (QoS) scores. To solve this problem, we propose an efficient deep reinforcement learning algorithm based on proximal policy optimization (PPO) that handles dynamically changing environments. Simulation results demonstrate that the proposed APVST framework saves up to 20% and 50% of channel bandwidth compared to other semantic and traditional video transmission schemes, respectively. Moreover, our study confirms the efficiency of RSMA in panoramic video transmission, achieving performance gains of 13% and 20% compared to NOMA and OFDMA, respectively.
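To make the WS-PSNR metric named in the abstract concrete, the following is a minimal sketch of how it is commonly computed. It assumes frames in equirectangular projection, where each pixel row is weighted by the cosine of its latitude. The abstract does not state the projection format or any implementation details, so the function names `erp_row_weights` and `ws_psnr`, the nested-list frame layout, and the 8-bit peak value are illustrative assumptions, not the authors' code.

```python
import math


def erp_row_weights(height):
    """Per-row weights cos(latitude) for an equirectangular frame.

    Assumption: row j covers latitude (j + 0.5 - height/2) * pi / height,
    so rows near the equator weigh close to 1 and rows near the poles close to 0.
    """
    return [math.cos((j + 0.5 - height / 2.0) * math.pi / height)
            for j in range(height)]


def ws_psnr(reference, distorted, max_value=255.0):
    """WS-PSNR in dB between two single-channel frames given as lists of rows.

    Hypothetical helper for illustration: a weighted mean squared error is
    normalized by the sum of weights and then mapped to decibels.
    """
    height = len(reference)
    if height == 0 or len(distorted) != height:
        raise ValueError("frames must be non-empty and have the same height")

    weighted_err = 0.0
    weight_sum = 0.0
    for w, ref_row, dis_row in zip(erp_row_weights(height), reference, distorted):
        if len(ref_row) != len(dis_row):
            raise ValueError("frames must have the same width")
        for r, d in zip(ref_row, dis_row):
            weighted_err += w * (r - d) ** 2
            weight_sum += w

    if weight_sum == 0.0:
        raise ValueError("frames must contain at least one pixel")

    wmse = weighted_err / weight_sum
    if wmse == 0.0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / wmse)
```

Under this weighting, an error near the poles lowers the score less than the same error at the equator, which reflects how equirectangular frames oversample polar regions relative to the sphere the viewer actually sees.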

Authors (7)
  1. Haixiao Gao (5 papers)
  2. Mengying Sun (14 papers)
  3. Xiaodong Xu (268 papers)
  4. Shujun Han (7 papers)
  5. Bizhu Wang (10 papers)
  6. Jingxuan Zhang (31 papers)
  7. Ping Zhang (437 papers)
