Deep Reinforcement Learning for Task Offloading in Mobile Edge Computing Systems
This paper addresses the problem of task offloading in mobile edge computing (MEC) systems, focusing on non-divisible and delay-sensitive tasks. MEC systems allow mobile devices to offload computation tasks to nearby edge nodes to mitigate local processing limitations. However, deciding whether to offload and selecting which edge node to use can be complex due to the unpredictable load dynamics at the edge nodes. The paper proposes a distributed, model-free, deep reinforcement learning (DRL)-based algorithm that allows each mobile device to make offloading decisions independently, without requiring complete information about the tasks or decisions of other devices.
Task Offloading Challenge
The task offloading problem is a classic challenge in MEC, driven by two fundamental questions: whether a given task should be offloaded at all and, if so, to which edge node it should be sent. Offloaded tasks experience varying processing delays because tasks from many devices compete for the limited resources at each edge node. Existing methods often rely on fixed resource-sharing assumptions or require centralized control with global knowledge, both of which are impractical in dynamic settings where task arrivals and computational requirements are unknown in advance.
Proposed DRL-Based Algorithm
This paper proposes a decentralized DRL approach based on deep Q-learning, designed to adapt to MEC environments with changing task loads and conditions. The algorithm combines long short-term memory (LSTM), dueling deep Q-network (DQN), and double-DQN enhancements to improve learning efficiency and the accuracy with which each device predicts the expected long-term cost of its offloading decisions.
- State, Action, and Cost Formulation: Each mobile device observes task attributes, queue statuses, and the historical load at the edge nodes; together these observations define its state. Actions represent the decisions, i.e., whether to process a task locally or offload it to a particular edge node, and the cost combines task delay with a penalty for missed deadlines (a minimal encoding sketch follows this list).
- Learning Mechanism: The algorithm trains a neural network that maps state-action pairs to Q-values, estimating the expected long-term cost of each action in the observed state. Using recent observations of task sizes, queue states, and edge-node loads, each device iteratively refines its policy from experiences stored in a replay memory (see the training-step sketch after this list). Because every device learns from its own local observations, no global knowledge is required.
- Neural Network Architecture: LSTM layers capture the sequential nature of edge-load variations, while the split into separate state-value and action-advantage streams (dueling DQN) sharpens the network's ability to weigh the cost-benefit trade-offs of each decision. A double-DQN update rule reduces the overestimation bias that arises when the same network both selects and evaluates actions (see the network and update sketches below).
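To make the formulation concrete, the following is a minimal sketch of how a device's state, action space, and per-task cost might be encoded. The field names, dimensions, and penalty value are illustrative assumptions, not the paper's exact notation.

```python
import numpy as np

NUM_EDGE_NODES = 5   # assumed number of edge nodes
HISTORY_LEN = 4      # assumed length of the load history kept per edge node

def build_state(task_size, local_queue_delay, edge_queue_delays, load_history):
    """Concatenate a device's local observations into a flat state vector.

    task_size         : size of the newly arrived task (bits)
    local_queue_delay : estimated wait in the device's own queue (s)
    edge_queue_delays : per-edge-node queue estimates, shape (NUM_EDGE_NODES,)
    load_history      : recent observed loads, shape (NUM_EDGE_NODES, HISTORY_LEN)
    """
    return np.concatenate(
        [[task_size, local_queue_delay], edge_queue_delays, load_history.ravel()]
    ).astype(np.float32)

# Action space: 0 = process locally, 1..NUM_EDGE_NODES = offload to that node.
NUM_ACTIONS = 1 + NUM_EDGE_NODES

def cost(delay, deadline, drop_penalty=10.0):
    """Per-task cost: the delay when the deadline is met, a fixed penalty when
    the task is dropped. The penalty weight is a placeholder."""
    return delay if delay <= deadline else drop_penalty
```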
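The sketch below shows one plausible PyTorch realization of such a network: an LSTM summarizes the sequence of recent edge-load observations, and dueling heads split the estimate into a state value and per-action advantages. The layer sizes and the static/sequential input split are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class DuelingLSTMQNet(nn.Module):
    """Q-network sketch combining an LSTM over edge-load history with
    dueling value/advantage heads."""

    def __init__(self, static_dim, load_dim, num_actions, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=load_dim, hidden_size=hidden,
                            batch_first=True)
        self.trunk = nn.Sequential(
            nn.Linear(static_dim + hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)                # state-value head V(s)
        self.advantage = nn.Linear(hidden, num_actions)  # advantage head A(s,a)

    def forward(self, static_obs, load_seq):
        # static_obs: (batch, static_dim)  task size, queue states, ...
        # load_seq:   (batch, seq_len, load_dim)  history of edge-node loads
        _, (h_n, _) = self.lstm(load_seq)
        feat = self.trunk(torch.cat([static_obs, h_n[-1]], dim=1))
        v, a = self.value(feat), self.advantage(feat)
        # Dueling combination: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)
        return v + a - a.mean(dim=1, keepdim=True)
```

Subtracting the mean advantage keeps the value/advantage decomposition identifiable, which is the standard dueling-DQN construction.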
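Finally, the following sketch illustrates the learning mechanism: a replay memory plus one double-DQN gradient step. For brevity it assumes the Q-network maps a single flat state tensor to one Q-value per action, and since Q here estimates expected cost rather than reward, the greedy action is an argmin; the buffer capacity, batch size, and discount factor are placeholders.

```python
import random
from collections import deque

import torch
import torch.nn.functional as F

class ReplayMemory:
    """Fixed-size buffer of (state, action, cost, next_state, done) tuples."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, *transition):
        self.buffer.append(tuple(torch.as_tensor(x, dtype=torch.float32)
                                 for x in transition))

    def sample(self, batch_size=32):
        batch = random.sample(self.buffer, batch_size)
        return [torch.stack(field) for field in zip(*batch)]

def double_dqn_step(q_net, target_net, optimizer, memory, gamma=0.99):
    """One gradient step on a sampled minibatch."""
    states, actions, costs, next_states, done = memory.sample()
    # Q-values of the actions actually taken.
    q = q_net(states).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Double DQN: the online network *selects* the next action and the
        # target network *evaluates* it, reducing overestimation bias.
        # Costs are minimized, so the greedy choice is argmin, not argmax.
        next_actions = q_net(next_states).argmin(dim=1, keepdim=True)
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        target = costs + gamma * next_q * (1.0 - done)
    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```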
Results and Implications
Simulation results show substantial gains for the proposed DRL algorithm in reducing task drops and delays across a range of conditions, including varying task arrival rates, computational loads, and system capacities. In scenarios with 50 mobile devices and five edge nodes, the method reduced the task drop rate by up to 95.4% and the average delay by up to 30.1% compared with existing methods such as the Potential Game-based Offloading Algorithm (PGOA) and the User-Level Online Offloading Framework (ULOOF).
Future Directions
The proposed DRL approach not only improves resource efficiency in MEC systems but also opens avenues for further research, notably cooperative learning across devices to accelerate convergence and improve robustness under more complex network dynamics. Future studies might integrate multi-agent reinforcement learning frameworks to enable cooperative offloading strategies, advancing toward an MEC ecosystem capable of dynamic, distributed resource management and task scheduling.
In conclusion, this work advances mobile edge computing by demonstrating an effective, autonomous decision-making framework for task offloading, with significant implications for the scalability and performance of MEC systems in real-world deployments.