D$^2$-JSCC: Digital Deep Joint Source-channel Coding for Semantic Communications (2403.07338v3)
Abstract: Semantic communications (SemCom) have emerged as a new paradigm for supporting sixth-generation applications, where semantic features of data are transmitted using artificial intelligence algorithms to attain high communication efficiency. Most existing SemCom techniques use deep neural networks (DNNs) to implement analog source-channel mappings, which are incompatible with existing digital communication architectures. To address this issue, this paper proposes a novel framework of digital deep joint source-channel coding (D$^2$-JSCC) targeting image transmission in SemCom. The framework features digital source and channel coding that are jointly optimized to reduce the end-to-end (E2E) distortion. First, deep source coding with an adaptive density model is designed to encode semantic features according to their distributions. Second, digital channel coding is employed to protect the encoded features against channel distortion. To facilitate their joint design, the E2E distortion is characterized as a function of the source and channel rates through an analysis based on a Bayesian model and a Lipschitz assumption on the DNNs. Then, to minimize the E2E distortion, a two-step algorithm is proposed to control the source-channel rates for a given channel signal-to-noise ratio. Simulation results reveal that the proposed framework outperforms classic deep JSCC and mitigates the cliff and leveling-off effects, which commonly exist for separation-based approaches.
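The source-channel rate trade-off mentioned in the abstract can be illustrated with a toy model. The sketch below is not the paper's two-step algorithm or its distortion characterization. It assumes a Gaussian source with distortion-rate function $\sigma^2 2^{-2R}$, a complex AWGN channel, and a normal approximation to the block error probability. It then brute-forces the source rate that minimizes a simple expected-distortion proxy under a fixed budget of channel uses. All function names, parameters, and default values are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def block_error_prob(rate, snr_db, n):
    """Toy block error probability for a length-n code at `rate` bits per channel use.

    Assumption: complex AWGN channel with the normal approximation
    eps ~= Q((C - R) * sqrt(n / V)), ignoring lower-order terms.
    """
    snr = 10.0 ** (snr_db / 10.0)
    capacity = math.log2(1.0 + snr)
    dispersion = snr * (snr + 2.0) / (snr + 1.0) ** 2 * math.log2(math.e) ** 2
    return q_function((capacity - rate) * math.sqrt(n / dispersion))


def e2e_distortion(bits_per_sample, snr_db, num_samples, n, variance=1.0, d_max=1.0):
    """Expected distortion proxy: source distortion if decoding succeeds, d_max otherwise."""
    # Assumption: Gaussian source, so D(R) = variance * 2^(-2R).
    d_source = variance * 2.0 ** (-2.0 * bits_per_sample)
    # A fixed budget of n channel uses ties the channel rate to the source rate.
    channel_rate = bits_per_sample * num_samples / n
    eps = block_error_prob(channel_rate, snr_db, n)
    return (1.0 - eps) * d_source + eps * d_max


def best_source_rate(snr_db, num_samples=256, n=512, grid=200, max_rate=8.0):
    """Grid search for the source rate (bits per sample) minimizing the proxy."""
    candidates = [max_rate * (i + 1) / grid for i in range(grid)]
    return min(candidates, key=lambda r: e2e_distortion(r, snr_db, num_samples, n))


if __name__ == "__main__":
    for snr_db in (0.0, 5.0, 10.0):
        r = best_source_rate(snr_db)
        d = e2e_distortion(r, snr_db, 256, 512)
        print(f"SNR {snr_db:4.1f} dB: source rate {r:.2f} bits/sample, distortion {d:.4f}")
```

In this toy setting, raising the source rate lowers the compression distortion but pushes the channel code closer to capacity, so the selected rate grows with the SNR. Re-selecting the rate for each SNR, rather than fixing it in advance, is what avoids an abrupt quality drop when the channel degrades.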
Authors: Jianhao Huang, Kai Yuan, Chuan Huang, Kaibin Huang