Multi-Agent Reinforcement Learning for Multi-Cell Spectrum and Power Allocation (2312.05746v2)
Abstract: This paper introduces a novel approach to radio resource allocation in multi-cell wireless networks using a fully scalable multi-agent reinforcement learning (MARL) framework. A distributed method is developed in which agents control individual cells and determine spectrum and power allocation based on limited local information, yet achieve quality of service (QoS) performance comparable to centralized methods using global information. The objective is to minimize packet delays across devices under stochastic packet arrivals, and the formulation applies to both conflict graph abstractions and cellular network configurations. The problem is cast as distributed learning and solved with a multi-agent proximal policy optimization (MAPPO) algorithm that incorporates recurrent neural networks and queueing dynamics. This traffic-driven MARL-based solution enables decentralized training and execution, ensuring scalability to large networks. Extensive simulations demonstrate that the proposed methods achieve QoS performance comparable to genie-aided centralized algorithms with significantly less execution time. The trained policies also exhibit scalability and robustness across various network sizes and traffic conditions.
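To make the abstract's two key ingredients concrete, here is a minimal illustrative sketch (not the paper's actual model or code) of per-cell queueing dynamics under stochastic packet arrivals, together with the PPO clipped surrogate objective at the core of MAPPO. All class names, rates, and parameters below are assumptions chosen for illustration.

```python
import random

class CellQueue:
    """One cell's packet queue: the agent picks a service rate (a stand-in
    for its spectrum/power allocation) and is rewarded by negative queue
    length, a standard proxy for packet delay via Little's law.
    Hypothetical sketch; not the paper's model."""

    def __init__(self, arrival_rate, seed=0):
        self.arrival_rate = arrival_rate   # mean packets per slot (assumed)
        self.queue = 0                     # packets currently waiting
        self.rng = random.Random(seed)

    def step(self, service_rate):
        # Serve up to service_rate packets this slot.
        self.queue = max(self.queue - service_rate, 0)
        # Bernoulli-thinned arrivals approximate a Poisson arrival process.
        self.queue += sum(
            1 for _ in range(10) if self.rng.random() < self.arrival_rate / 10
        )
        return -self.queue  # delay-proxy reward

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped * advantage)

# Usage: two independent cells, each driven by its own agent's action.
cells = [CellQueue(arrival_rate=2.0, seed=i) for i in range(2)]
rewards = [c.step(service_rate=3) for c in cells]
# A too-large policy ratio with positive advantage is clipped at 1 + eps,
# which is what keeps each agent's policy update conservative.
clipped_gain = ppo_clip_objective(ratio=1.5, advantage=2.0)
```

In the paper's decentralized setting, each agent would optimize this clipped objective using only its local observations (e.g., its own queue state), which is what makes training and execution scale with the number of cells.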