
Multi-Agent Reinforcement Learning with Action Masking for UAV-enabled Mobile Communications (2303.16737v2)

Published 29 Mar 2023 in cs.MA, cs.AI, and cs.LG

Abstract: Unmanned Aerial Vehicles (UAVs) are increasingly used as aerial base stations to provide ad hoc communications infrastructure. Building upon prior research efforts which consider either static nodes, 2D trajectories or single UAV systems, this paper focuses on the use of multiple UAVs for providing wireless communication to mobile users in the absence of terrestrial communications infrastructure. In particular, we jointly optimize UAV 3D trajectory and NOMA power allocation to maximize system throughput. Firstly, a weighted K-means-based clustering algorithm establishes UAV-user associations at regular intervals. The efficacy of training a novel Shared Deep Q-Network (SDQN) with action masking is then explored. Unlike training each UAV separately using DQN, the SDQN reduces training time by using the experiences of multiple UAVs instead of a single agent. We also show that SDQN can be used to train a multi-agent system with differing action spaces. Simulation results confirm that: 1) training a shared DQN outperforms a conventional DQN in terms of maximum system throughput (+20%) and training time (-10%); 2) it can converge for agents with different action spaces, yielding a 9% increase in throughput compared to mutual learning algorithms; and 3) combining NOMA with an SDQN architecture enables the network to achieve a better sum rate compared with existing baseline schemes.
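
The abstract names two mechanisms: action masking, and a single network trained on the pooled experiences of several UAVs. The sketch below illustrates both in plain Python. It is not the authors' implementation. The names `masked_greedy_action` and `SharedReplayBuffer`, the tuple layout of a transition, and the choice to store a per-transition validity mask are all assumptions made for illustration. Storing the mask is one plausible way to let agents with differing action spaces share a buffer, since each agent can mark the actions it lacks as invalid.

```python
import random
from collections import deque


def masked_greedy_action(q_values, valid_mask):
    """Pick the highest-valued action among those marked valid.

    q_values:   one estimated Q-value per action (hypothetical network output).
    valid_mask: one boolean per action; False means the action is not
                allowed in the current state (or not available to this agent).
    """
    valid = [i for i, ok in enumerate(valid_mask) if ok]
    if not valid:
        raise ValueError("at least one action must be valid")
    return max(valid, key=lambda i: q_values[i])


class SharedReplayBuffer:
    """One experience pool that every agent writes to (illustrative only)."""

    def __init__(self, capacity):
        # Oldest transitions are discarded once capacity is reached.
        self._items = deque(maxlen=capacity)

    def add(self, agent_id, state, action, reward, next_state, next_mask):
        # next_mask records which actions are valid in next_state, so a
        # bootstrapped target can also ignore invalid actions.
        self._items.append(
            (agent_id, state, action, reward, next_state, next_mask)
        )

    def sample(self, batch_size, rng=random):
        # Uniform sampling across all agents' transitions.
        k = min(batch_size, len(self._items))
        return rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    # Action 1 has the largest Q-value but is masked out, so action 2 wins.
    print(masked_greedy_action([1.0, 5.0, 3.0], [True, False, True]))

    buffer = SharedReplayBuffer(capacity=100)
    buffer.add("uav_0", (0, 0, 10), 2, 1.5, (0, 1, 10), [True, True, True])
    buffer.add("uav_1", (5, 5, 20), 0, 0.7, (5, 5, 25), [True, False, True])
    print(len(buffer.sample(batch_size=2, rng=random.Random(0))))
```

Masking at selection time keeps invalid actions from ever being chosen, rather than relying on a learned penalty to discourage them. Whether the paper applies the mask at selection time, in the target computation, or both is not stated in the abstract.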

Authors (2)
  1. Danish Rizvi (1 paper)
  2. David Boyle (25 papers)
Citations (2)
