Generalization Bounds for Neural Belief Propagation Decoders (2305.10540v2)

Published 17 May 2023 in cs.IT, cs.LG, and math.IT

Abstract: Machine learning-based approaches are increasingly used for designing decoders for next-generation communication systems. One widely used framework is neural belief propagation (NBP), which unfolds the belief propagation (BP) iterations into a deep neural network whose parameters are trained in a data-driven manner. NBP decoders have been shown to improve upon classical decoding algorithms. In this paper, we investigate the generalization capabilities of NBP decoders. Specifically, the generalization gap of a decoder is the difference between its empirical and expected bit error rates. We present new theoretical results that bound this gap and show its dependence on the decoder complexity, in terms of the code parameters (blocklength, message length, variable/check node degrees), the number of decoding iterations, and the training dataset size. Results are presented for both regular and irregular parity-check matrices. To the best of our knowledge, this is the first set of theoretical results on the generalization performance of neural network-based decoders. We present experimental results showing the dependence of the generalization gap on the training dataset size and decoding iterations for different codes.
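As a concrete illustration of the unfolding described in the abstract, below is a minimal sketch of a single NBP layer in its weighted min-sum variant, one of the standard NBP parameterizations: a check-to-variable update followed by a variable-to-check update in which trainable per-edge weights scale the incoming messages. The parity-check matrix H, the LLR values, and the weight layout are hypothetical, chosen only to make the sketch runnable; they are not taken from the paper.

```python
import numpy as np

# Illustrative sketch (not the authors' code) of one unfolded neural belief
# propagation iteration, in its weighted min-sum variant. H, llr, and the
# per-edge weights w are toy assumptions.

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 0, 1, 1]])  # toy 3x6 parity-check matrix

def nbp_iteration(llr, v2c, w):
    """One decoding layer: check update, then weighted variable update."""
    m, n = H.shape
    c2v = np.zeros((m, n))
    # Check-to-variable (min-sum): sign product and minimum magnitude over
    # the other variable neighbors of each check node.
    for c in range(m):
        nbrs = np.flatnonzero(H[c])
        for v in nbrs:
            others = nbrs[nbrs != v]
            c2v[c, v] = (np.prod(np.sign(v2c[c, others]))
                         * np.min(np.abs(v2c[c, others])))
    # Variable-to-check: channel LLR plus a *weighted* sum of incoming check
    # messages; the weights w are the trainable NBP parameters.
    new_v2c = np.zeros((m, n))
    for v in range(n):
        chks = np.flatnonzero(H[:, v])
        for c in chks:
            others = chks[chks != c]
            new_v2c[c, v] = llr[v] + np.sum(w[others, v] * c2v[others, v])
    out_llr = llr + np.sum(w * c2v, axis=0)  # per-bit output LLRs
    return new_v2c, out_llr

# Usage: initialize variable-to-check messages with the channel LLRs and unit
# weights, so the layer reduces to plain min-sum BP before any training.
llr = np.array([1.2, -0.4, 0.9, -2.0, 0.3, 1.5])  # hypothetical channel LLRs
v2c = H * llr                                     # place LLRs on the edges
w = H.astype(float)                               # one weight per edge
v2c, out_llr = nbp_iteration(llr, v2c, w)
hard_bits = (out_llr < 0).astype(int)             # hard-decision output
```

Training such weights on a finite set of noisy codewords yields an empirical bit error rate; the generalization gap bounded in the paper is the difference between this empirical BER and its expectation over the channel, as a function of the code parameters, the number of unfolded iterations, and the training set size.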
