A Multi-Agent Reinforcement Learning Approach for Cooperative Air-Ground-Human Crowdsensing in Emergency Rescue (2505.06997v1)

Published 11 May 2025 in cs.AI

Abstract: Mobile crowdsensing is evolving beyond traditional human-centric models by integrating heterogeneous entities like unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). Optimizing task allocation among these diverse agents is critical, particularly in challenging emergency rescue scenarios characterized by complex environments, limited communication, and partial observability. This paper tackles the Heterogeneous-Entity Collaborative-Sensing Task Allocation (HECTA) problem specifically for emergency rescue, considering humans, UAVs, and UGVs. We introduce a novel "Hard-Cooperative" policy where UGVs prioritize recharging low-battery UAVs, alongside performing their sensing tasks. The primary objective is maximizing the task completion rate (TCR) under strict time constraints. We rigorously formulate this NP-hard problem as a decentralized partially observable Markov decision process (Dec-POMDP) to effectively handle sequential decision-making under uncertainty. To solve this, we propose HECTA4ER, a novel multi-agent reinforcement learning algorithm built upon a Centralized Training with Decentralized Execution architecture. HECTA4ER incorporates tailored designs, including specialized modules for complex feature extraction, utilization of action-observation history via hidden states, and a mixing network integrating global and local information, specifically addressing the challenges of partial observability. Furthermore, theoretical analysis confirms the algorithm's convergence properties. Extensive simulations demonstrate that HECTA4ER significantly outperforms baseline algorithms, achieving an average 18.42% increase in TCR. Crucially, a real-world case study validates the algorithm's effectiveness and robustness in dynamic sensing scenarios, highlighting its strong potential for practical application in emergency response.

Summary

This paper applies multi-agent reinforcement learning (MARL) to task allocation for mobile crowdsensing in emergency rescue scenarios. The authors address the Heterogeneous-Entity Collaborative-Sensing Task Allocation (HECTA) problem, in which human participants, unmanned aerial vehicles (UAVs), and unmanned ground vehicles (UGVs) share the sensing tasks. The objective is to maximize the task completion rate (TCR) under strict time constraints, a critical factor in real-world rescue operations.

Key Contributions

Hard-Cooperative Policy: Under this policy, UGVs prioritize recharging low-battery UAVs, which matters because UAVs have limited energy capacity. The policy is built into the task allocation process, so UAVs can keep operating while the UGVs continue their own sensing tasks.
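The priority ordering described above can be illustrated with a small hand-written rule. This is only a sketch under stated assumptions, not the paper's learned policy: the `LOW_BATTERY` threshold, the `UAV` and `Task` records, and the nearest-first tie-breaking are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from math import hypot

LOW_BATTERY = 0.2  # hypothetical threshold, as a fraction of full charge


@dataclass
class UAV:
    name: str
    x: float
    y: float
    battery: float  # 0.0 (empty) .. 1.0 (full)


@dataclass
class Task:
    name: str
    x: float
    y: float


def choose_ugv_target(ugv_xy, uavs, tasks):
    """Pick a UGV's next goal: recharging a low-battery UAV comes before sensing."""
    ux, uy = ugv_xy

    def dist(obj):
        return hypot(obj.x - ux, obj.y - uy)

    low = [u for u in uavs if u.battery < LOW_BATTERY]
    if low:
        # Recharging takes priority over any pending sensing task.
        return ("recharge", min(low, key=dist))
    if tasks:
        # No UAV needs help, so fall back to the nearest sensing task.
        return ("sense", min(tasks, key=dist))
    return ("idle", None)
```

For example, with one UAV at 10% battery and a sensing task closer to the UGV, the rule still returns the recharge goal; once no UAV is below the threshold, the UGV goes back to sensing.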
Dec-POMDP Formulation: The authors formulate the HECTA problem, which is NP-hard, as a decentralized partially observable Markov decision process (Dec-POMDP). This formulation suits sequential decision-making under uncertainty and limited communication, both of which characterize emergency response scenarios.
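For orientation, a Dec-POMDP is commonly written as the tuple below. This is the standard textbook form, and the symbols are assumptions for illustration; the paper's own notation and reward definition may differ.

```latex
\[
\mathcal{M} = \big\langle \mathcal{N},\ \mathcal{S},\ \{\mathcal{A}_i\}_{i \in \mathcal{N}},\ P,\ R,\ \{\Omega_i\}_{i \in \mathcal{N}},\ O,\ \gamma \big\rangle
\]
% N: set of agents (here humans, UAVs, and UGVs)   S: global states
% A_i: actions of agent i                          P(s' | s, a): state transition function
% R(s, a): shared team reward                      Omega_i: observations of agent i
% O(o | s', a): joint observation function         gamma in [0, 1): discount factor
\[
\max_{\pi_1, \dots, \pi_n}\ \mathbb{E}\!\left[ \sum_{t=0}^{T} \gamma^{t}\, R(s_t, \mathbf{a}_t) \right],
\qquad a_{i,t} \sim \pi_i(\,\cdot \mid \tau_{i,t})
\]
```

Here $\tau_{i,t}$ denotes agent $i$'s action-observation history up to time $t$. Each agent acts on its own history rather than on the global state $s_t$, which is what makes the problem decentralized and partially observable.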
HECTA4ER Algorithm: The core of the solution is HECTA4ER, a MARL algorithm built on a Centralized Training with Decentralized Execution (CTDE) architecture. It incorporates several tailored designs:
- Complex Feature Extraction: Convolutional modules extract environmental features that inform decision-making in dynamic rescue environments.
- Action-Observation History Utilization: Hidden states carry the action-observation history, which supports decision-making under partial observability.
- Mixing Network: A mixing network integrates global and local information, again to address the challenges of partial observability.
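To make the idea of a mixing network concrete, here is a minimal sketch of one well-known design, a QMIX-style monotonic mixer, in which the global state generates non-negative weights that combine per-agent values into a team value. Whether HECTA4ER's mixer takes this form is an assumption; the class name, the single linear layer, and the random initialization are all hypothetical.

```python
import random


def linear(weights, bias, x):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class MonotonicMixer:
    """Q_tot = sum_i |w_i(s)| * Q_i + b(s), with w and b generated from the global state s."""

    def __init__(self, n_agents, state_dim, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

        self.hyper_w = rand(n_agents, state_dim)  # state -> one mixing weight per agent
        self.hyper_w_bias = [0.0] * n_agents
        self.hyper_b = rand(1, state_dim)         # state -> scalar bias
        self.hyper_b_bias = [0.0]

    def __call__(self, agent_qs, state):
        # Absolute values keep the weights non-negative, so Q_tot never
        # decreases when any single agent's Q-value increases.
        w = [abs(v) for v in linear(self.hyper_w, self.hyper_w_bias, state)]
        b = linear(self.hyper_b, self.hyper_b_bias, state)[0]
        return sum(wi * qi for wi, qi in zip(w, agent_qs)) + b
```

In a CTDE setup, a mixer like this would be used only during centralized training, where the global state is available. At execution time each agent would act on its own local value estimate, and the non-negative weights are what keep those local greedy choices consistent with the team value.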

Empirical Validation

HECTA4ER is evaluated in extensive simulations, where it outperforms baseline algorithms with an average 18.42% increase in TCR. A real-world case study further supports the algorithm's effectiveness and robustness in dynamic sensing scenarios, both of which matter for emergency rescue operations.
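As a reminder of what the metric measures, TCR can be sketched as the fraction of tasks finished within the time limit. The paper's exact definition is not reproduced in this summary, so the function below is an assumption, and the completion times in the usage note are made-up values rather than results from the paper.

```python
def task_completion_rate(completion_times, deadline):
    """Fraction of tasks finished by `deadline`; None marks a task that was never completed."""
    if not completion_times:
        raise ValueError("at least one task is required")
    done = sum(1 for t in completion_times if t is not None and t <= deadline)
    return done / len(completion_times)
```

With hypothetical completion times `[3, 7, None, 12, 5]` and a deadline of `10`, three of the five tasks count as completed, giving a TCR of 0.6.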

Theoretical Implications and Future Directions

The paper includes a theoretical analysis of the algorithm's convergence properties. For future work, the authors point to further exploration of collaborative MARL frameworks, suggesting enhancements through large model-enabled agents and deeper environmental modeling. These directions aim to improve autonomous decision-making so that it copes better with the unpredictability and complexity of real-world rescue tasks.

Conclusion

By combining MARL techniques with a cooperative policy tailored to UAV and UGV coordination, this paper offers a solution to a pressing challenge in emergency response. The approach, supported by theoretical and empirical analysis, is a promising route to more efficient and effective rescue operations in environments where time and coordination are of the essence.