- The paper introduces a novel DRL framework that formulates the UAV trajectory planning problem as a Markov Decision Process and solves it with a Double Deep Q-Network (DDQN).
- The paper leverages a QoS-based ε-greedy policy with DDQN to mitigate overestimation and secure a 99% QoS guarantee for terminal users.
- The paper demonstrates that dynamic DRL-based trajectory planning notably improves throughput and energy efficiency in UAV-assisted MEC systems.
Path Planning for UAV-Mounted Mobile Edge Computing with Deep Reinforcement Learning
The paper "Path Planning for UAV-Mounted Mobile Edge Computing with Deep Reinforcement Learning" presents a novel approach to optimizing the trajectory of Unmanned Aerial Vehicles (UAVs) used in mobile edge computing (MEC) networks. The focus is on enhancing computational efficiency by dynamically managing UAV trajectories in response to varying locations of mobile terminal users (TUs). The paper leverages Deep Reinforcement Learning (DRL) methodologies, specifically the Double Deep Q-Network (DDQN), to address the inherent challenges of managing large state-action spaces induced by dynamic TU trajectories.
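To make the scale of that state-action space concrete, a neural Q-function typically replaces a tabular one: the continuous UAV and TU positions are encoded as a state vector and mapped to Q-values over a small discrete set of flight actions. The following is a minimal PyTorch sketch of such a Q-network; the layer sizes, state encoding, and five-action flight set are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a continuous state (UAV position plus TU positions) to Q-values
    over a discrete set of flight actions (e.g. hover / N / S / E / W).
    Layer sizes and the action set are illustrative assumptions."""
    def __init__(self, state_dim, num_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state):
        return self.net(state)

# Example: UAV (x, y) plus four TUs' (x, y) -> 10-dim state, 5 flight actions.
q_net = QNetwork(state_dim=10, num_actions=5)
```

Because the positions are continuous, enumerating states is infeasible; the network generalizes across nearby states, which is what makes the DRL approach tractable here.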
The authors propose an optimization framework in which the UAV trajectory problem is formulated as a Markov Decision Process (MDP). This formulation captures the stochastic nature of TU mobility, which is described by the Gauss-Markov random model (GMRM). The research aims to optimize the UAV's trajectory to maximize the system's reward while adhering to quality-of-service (QoS) constraints and respecting the UAV's energy limitations.
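For intuition about the mobility model, the sketch below implements one Gauss-Markov update step for a single TU, in which speed and direction are correlated over time through a memory parameter; all numeric values are illustrative assumptions rather than the paper's simulation settings.

```python
import numpy as np

def gauss_markov_step(pos, speed, direction, alpha=0.8, mean_speed=1.0,
                      mean_direction=0.0, sigma_s=0.2, sigma_d=0.3, dt=1.0):
    """One Gauss-Markov mobility update for a terminal user.

    alpha sets the memory level: alpha -> 1 approaches constant motion,
    alpha -> 0 approaches a memoryless random walk. All numeric values
    here are illustrative, not the paper's simulation settings.
    """
    noise = np.sqrt(1.0 - alpha**2)
    speed = alpha * speed + (1 - alpha) * mean_speed + noise * sigma_s * np.random.randn()
    direction = alpha * direction + (1 - alpha) * mean_direction + noise * sigma_d * np.random.randn()
    pos = pos + speed * dt * np.array([np.cos(direction), np.sin(direction)])
    return pos, speed, direction

# Example: simulate one TU for 10 time steps from the origin.
pos, speed, direction = np.zeros(2), 1.0, 0.0
for _ in range(10):
    pos, speed, direction = gauss_markov_step(pos, speed, direction)
```

The memory parameter is what makes the TU trajectories stochastic yet temporally correlated, which is exactly the property that makes a fixed, precomputed UAV path suboptimal and motivates the MDP view.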
This investigation is notable for employing a DDQN approach, which mitigates the overestimation issues found in traditional Deep Q-Network (DQN) models. The paper pairs the DDQN structure with a proposed QoS-based ε-greedy policy, which steers the UAV toward actions that improve throughput while maintaining QoS guarantees for each TU. Simulation results demonstrate that the proposed algorithm not only converges faster than conventional reinforcement learning counterparts but also delivers higher throughput. In the reported numerical results, the algorithm secured a 99% QoS guarantee rate for each terminal user, a significant improvement over existing approaches.
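A compact sketch of the two ingredients named above follows, assuming a PyTorch Q-network like the one sketched earlier: the Double DQN target, in which the online network selects the next action and the target network evaluates it, and an illustrative QoS-filtered ε-greedy rule. The exact QoS filtering condition is an assumption; the paper defines its own policy.

```python
import random
import torch

def ddqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: the online net picks the next action, the target
    net evaluates it, which curbs the overestimation of vanilla DQN's max."""
    with torch.no_grad():
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_q

def qos_epsilon_greedy(q_values, qos_feasible, epsilon=0.1):
    """Illustrative QoS-based epsilon-greedy: random exploration is restricted
    to actions whose one-step QoS check passes (qos_feasible is a boolean
    tensor); otherwise the greedy action is taken. This filtering rule is an
    assumption, not the paper's exact definition."""
    if random.random() < epsilon:
        candidates = torch.nonzero(qos_feasible, as_tuple=False).flatten()
        if candidates.numel() > 0:
            return int(candidates[torch.randint(len(candidates), (1,))])
    return int(torch.argmax(q_values))
```

Decoupling action selection from action evaluation is what removes the systematic upward bias of the max operator in vanilla DQN, and constraining exploration to QoS-feasible actions is one simple way a policy can keep per-TU guarantees high during learning.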
In practical terms, the findings hold particular relevance for UAV-assisted MEC systems targeting areas with uneven communication infrastructure, such as rural and disaster-stricken regions. The algorithm's robustness to different TU speeds also highlights its applicability in diverse and dynamic environments, strengthening the case for UAVs as flexible, mobile MEC nodes. The theoretical implications are equally significant: the results offer insight into efficiently applying DRL in MEC contexts, a challenge compounded by the high-dimensional state-action spaces and dynamic variability of mobile environments.
Future work could extend this DRL-based framework to other UAV application areas beyond MEC, such as disaster relief operations or dynamic environmental monitoring. Further research might also examine how such algorithms scale to scenarios with multiple UAVs or heterogeneous MEC network conditions.
Overall, the paper contributes significantly to the broader field of UAV-enabled edge computing, demonstrating DRL's promise for achieving optimal system performance in complex, mobile, and resource-constrained network environments.