Multi-Agent Deep Reinforcement Learning Based Trajectory Planning for Multi-UAV Assisted Mobile Edge Computing (2009.11277v1)

Published 23 Sep 2020 in eess.SP and cs.LG

Abstract: An unmanned aerial vehicle (UAV)-aided mobile edge computing (MEC) framework is proposed, where several UAVs having different trajectories fly over the target area and support the user equipments (UEs) on the ground. We aim to jointly optimize the geographical fairness among all the UEs, the fairness of each UAV's UE-load and the overall energy consumption of UEs. The above optimization problem includes both integer and continuous variables and is challenging to solve. To address this problem, a multi-agent deep reinforcement learning based trajectory control algorithm is proposed for managing the trajectory of each UAV independently, where the popular Multi-Agent Deep Deterministic Policy Gradient (MADDPG) method is applied. Given the UAVs' trajectories, a low-complexity approach is introduced for optimizing the offloading decisions of UEs. We show that our proposed solution considerably outperforms other traditional algorithms in terms of the fairness in serving UEs, the fairness of the UE-load at each UAV, and the energy consumption of all the UEs.

Citations (244)

Summary

  • The paper introduces a comprehensive MADRL algorithm that optimizes UAV trajectories to improve fairness, load balancing, and energy efficiency.
  • It formulates a complex joint optimization problem involving discrete offloading decisions and continuous trajectory control in multi-UAV systems.
  • Simulations show that the proposed method outperforms conventional strategies by achieving superior fairness indices and lower energy consumption.

Multi-Agent Deep Reinforcement Learning Based Trajectory Planning for Multi-UAV Assisted Mobile Edge Computing

In this paper, the authors present a novel Unmanned Aerial Vehicle (UAV)-aided Mobile Edge Computing framework focused on enhancing the efficiency and fairness of task offloading in multi-UAV systems. The core problem investigated is the joint optimization of user equipment (UE) fairness, UAV load balancing, and energy efficiency within these networks. This involves handling both integer and continuous variables, posing challenges for conventional optimization techniques. To address this, the paper introduces a trajectory control algorithm using a Multi-Agent Deep Reinforcement Learning (MADRL) approach based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) method.
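A joint objective of this shape can be sketched as follows. The notation here is illustrative rather than the paper's exact formulation: a common choice for the fairness terms is the Jain index, and the symbols (UAV trajectories q_m(t), binary offloading indicators a_{k,m}(t), UE service amounts T_k, UAV loads L_m, UE energies E_k, and the weight η) are assumptions made for exposition:

```latex
% Illustrative sketch; symbols are assumptions, not the paper's exact notation.
% q_m(t): continuous trajectory of UAV m;  a_{k,m}(t) \in \{0,1\}: offloading indicator.
\max_{\{\mathbf{q}_m(t)\},\,\{a_{k,m}(t)\}} \;
  f\bigl(T_1,\dots,T_K\bigr) \;+\; f\bigl(L_1,\dots,L_M\bigr) \;-\; \eta \sum_{k=1}^{K} E_k,
\qquad
f(x_1,\dots,x_n) \;=\; \frac{\left(\sum_{i=1}^{n} x_i\right)^{2}}{n \sum_{i=1}^{n} x_i^{2}} .
```

The mixed nature of the problem is visible directly in the decision variables: the a_{k,m}(t) are integers while the q_m(t) are continuous, which is what makes conventional optimization techniques awkward here.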

Key Results and Techniques

  1. Framework Design:
    • The paper proposes a comprehensive multi-UAV framework where UAVs serve as central points for mobile edge computation over areas lacking robust infrastructure. UAVs fly over ground UEs, aiding in task offloading and computation.
  2. Problem Formulation:
    • The UAV trajectory optimization is formulated as a problem involving maximizing geographical fairness among UEs, balancing the load across UAVs, and minimizing UE energy consumption. Given the mix of discrete offloading decisions and continuous UAV trajectory adjustments, this forms a complex multi-faceted optimization problem.
  3. Solution via MADRL:
    • The authors develop a MADRL-based algorithm that independently manages the trajectories of multiple UAVs. The MADDPG approach is utilized to encourage coordinated decision-making among the UAVs, allowing them to adaptively find optimal routes and offloading strategies under dynamic network conditions.
    • Special attention is given to the introduction of prioritized experience replay to improve exploration efficiency and stability during the learning process.
  4. Performance Analysis:
    • Simulations demonstrate that the proposed DRL-based algorithm surpasses traditional benchmarks such as random movement strategies (RANDOM) and predefined circular trajectories (CIRCLE) in terms of fairness indices and energy consumption.
    • Specifically, the proposed scheme achieves a higher fairness index both for the distribution of service time among UEs and for the balance of computational load across the UAVs.
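For context, the centralized-critic structure that MADDPG (Lowe et al., 2017) brings to this setting can be summarized in one update rule: each UAV agent i trains a deterministic policy μ_i on its local observation o_i, while its critic Q_i conditions on the joint state x and all agents' actions, which is what enables the coordinated trajectory decisions described above:

```latex
\nabla_{\theta_i} J(\mu_i)
  = \mathbb{E}_{x,a \sim \mathcal{D}}
    \Bigl[ \nabla_{\theta_i} \mu_i(o_i)\,
           \nabla_{a_i} Q_i^{\mu}\bigl(x, a_1, \dots, a_N\bigr)
           \Big|_{a_i = \mu_i(o_i)} \Bigr]
```

Here D is the shared replay buffer; only the critics need global information during training, so each UAV can act on local observations at execution time.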
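The prioritized experience replay mentioned above can be sketched as a proportional-priority buffer in the style of Schaul et al. (2015). This is a minimal illustration, not the paper's implementation: the class name, hyperparameters (alpha, beta), and the exact priority rule are assumptions:

```python
import numpy as np


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (sketch).

    Transitions with larger TD error are sampled more often; importance
    weights correct the resulting bias in the learning update.
    """

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities skew sampling (0 = uniform)
        self.storage = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0  # next write position (circular buffer)

    def add(self, transition):
        # New transitions get the current max priority so each is replayed at least once.
        max_prio = self.priorities[: len(self.storage)].max() if self.storage else 1.0
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[: len(self.storage)] ** self.alpha
        probs = prios / prios.sum()
        idx = np.random.choice(len(self.storage), batch_size, p=probs)
        # Importance-sampling weights, normalized so the largest weight is 1.
        weights = (len(self.storage) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return [self.storage[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # Priority proportional to |TD error|; eps keeps every transition sampleable.
        self.priorities[idx] = np.abs(td_errors) + eps
```

In a MADDPG-style loop, each critic update would sample a batch, scale its TD losses by the returned weights, and feed the new absolute TD errors back via `update_priorities`.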

Implications and Future Directions

The implications of this research are significant for the design of next-generation UAV-assisted networks incorporating mobile edge computing. The framework not only addresses current limitations in UAV autonomous control and task allocation but also lays groundwork for more adaptive, scalable UAV deployments in dense urban environments or areas affected by disaster. The proposed algorithm's ability to dynamically adapt UAV trajectories to optimize various performance metrics highlights its potential application across diverse sectors from urban infrastructure management to emergency response systems.

Future research may explore extending this framework in several directions:

  • Incorporation of More Complex UE Dynamics: Considering additional real-world aspects such as moving UEs, real-time channel variations, and heterogeneous task requirements could further enhance system performance and applicability.
  • Scalability and Robustness: Investigating further into the scalability of MADRL under larger networks may be necessary for deployment feasibility in metropolitan scenarios.
  • Integration with Advanced Edge Solutions: Combining the trajectory planning with other edge computing paradigms, such as federated learning, might offer collective intelligence capabilities for UAV swarms.

The paper robustly anchors its contributions in UAV trajectory optimization, blending advanced MADRL techniques with practical edge computing needs. It suggests a promising path forward in the field of UAV-assisted edge computing, aligning with ongoing advancements in artificial intelligence and wireless communication technologies.