UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning (2404.07453v1)
Abstract: In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted air-to-ground communication system, where multiple UAVs form a UAV-enabled virtual antenna array (UVAA) to communicate with remote base stations via collaborative beamforming. To improve the efficiency of the UVAA, we formulate a UAV-enabled collaborative beamforming multi-objective optimization problem (UCBMOP) that simultaneously maximizes the transmission rate of the UVAA and minimizes the energy consumption of all UAVs by optimizing the positions and excitation current weights of all UAVs. This problem is challenging because the two objectives conflict with each other and are non-concave with respect to the optimization variables. Moreover, the system is dynamic and the cooperation among UAVs is complex, so traditional methods take a long time to compute a solution for even a single task. In addition, when the task changes, a previously obtained solution becomes obsolete. To handle these issues, we leverage multi-agent deep reinforcement learning (MADRL) to address the UCBMOP. Specifically, we use heterogeneous-agent trust region policy optimization (HATRPO) as the basic framework and propose an improved HATRPO algorithm, namely HATRPO-UCB, which introduces three techniques to enhance performance. Simulation results demonstrate that the proposed algorithm learns a better strategy than the compared methods. Moreover, extensive experiments demonstrate the effectiveness of each of the proposed techniques.