Multi-UAV Cooperative Path Planning
- Multi-UAV Cooperative Path Planning (MUCPP) is a systematic method that formulates multi-agent optimization problems to generate coordinated, collision-free trajectories for UAV teams.
- It leverages decomposition techniques like 3D slicing, grid partitioning, and metaheuristics to handle constraints such as collision avoidance, energy limits, and real-time adaptation.
- Empirical validations demonstrate that these advanced strategies reduce mission duration and energy consumption, making them effective for tasks like infrastructure inspection and disaster response.
Multi-UAV Cooperative Path Planning (MUCPP) is the systematic generation of coordinated, often collision-free trajectories for a team of unmanned aerial vehicles (UAVs) to jointly accomplish spatial tasks such as area coverage, infrastructure inspection, data collection, or dynamic servicing of ground demands. Modern MUCPP research spans combinatorial optimization, distributed algorithms, trajectory planning under vehicle and mission constraints, and often leverages advances in artificial intelligence, game theory, and robotic systems. Rigorous experimentation and theoretical guarantees are provided for both static and dynamic, structured and unstructured environments.
1. Mathematical Formulations and Problem Classes
MUCPP problems are typically formalized as coupled multi-agent combinatorial-optimization tasks augmented with motion and task constraints. Representative formalizations include:
- Set Covering Vehicle Routing Problems (SC-VRP): Given a sampled set of coverage primitives or waypoints on a 3D structure and a fleet of K UAVs, the objective is to minimize the maximum or total path cost subject to full area coverage, per-UAV constraints, and collision avoidance. Binary variables x_{ij}^k indicate assignment of the directed path segment from waypoint i to waypoint j to UAV k (Jing et al., 2020), giving an objective of the form min max_k Σ_{(i,j)} c_{ij} x_{ij}^k, subject to coverage, flow-continuity, and binary constraints.
- Multi-agent Traveling Salesman Problem (mTSP) and Multiple-Set TSP (MS-TSP): In area decomposition methods, the area of interest (AOI) is partitioned into cells; each UAV must cover a unique subset. The optimization aims to minimize total or maximum mission energy or distance, often with per-UAV battery constraints (Datsko et al., 16 Feb 2024).
- Multi-Objective Optimization: Trade-offs are encoded via composite loss functions, such as a weighted sum L = λ·D_total + (1 − λ)·D_max that balances total distance against makespan (Li et al., 29 Nov 2025).
- Dynamic settings: Formulated over discrete or continuous time with evolving demand sets, scheduling variables, and possibly battery charging constraints (Wang et al., 2018).
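As a concrete illustration of a composite objective of the kind described above, the following sketch evaluates a weighted sum of total distance and makespan over per-UAV tours. The weighting form and function names are illustrative, not taken from any cited paper.

```python
import math

def tour_length(points):
    """Euclidean length of an ordered open tour over (x, y) waypoints."""
    return sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))

def composite_cost(tours, lam=0.5):
    """Weighted sum of total distance and makespan over per-UAV tours.

    lam=1 optimizes pure efficiency (total distance); lam=0 pure
    fairness (makespan). Illustrative form, not a specific paper's loss.
    """
    lengths = [tour_length(t) for t in tours]
    return lam * sum(lengths) + (1 - lam) * max(lengths)
```

Sweeping `lam` from 0 to 1 traces out the efficiency-fairness trade-off that the multi-objective formulations above encode.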
2. Core Methodologies
2.1 Decomposition and Task Allocation
- 3D Slicing and Graph Clustering: For complex infrastructure inspection, a prior 3D model is sliced by horizontal planes, then intersection loops (branches) are identified via spectral graph techniques (graph Laplacian, eigenvalue analysis, k-means clustering). This enables branch-to-UAV assignment to ensure contiguous, non-overlapping coverage and exploits structural decomposability (Mansouri et al., 2016).
- Cellular and Grid-Based Partitioning: Boustrophedon or grid-based decomposition discretizes regions, with partitioning (e.g., Lloyd’s k-means or DARP algorithms) to balance workload and respect no-fly zones or sensor heterogeneity (Collins et al., 2021, Apostolidis et al., 2022). Auction mechanisms resolve overlapping or conflict cells.
- Clustering for Scalability: K-means or geometric clustering on task locations can pre-allocate sets to each UAV, ensuring complexity does not scale combinatorially with problem size (Liu et al., 3 Jun 2025).
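The clustering-based pre-allocation step can be sketched with a plain Lloyd's k-means iteration. This is a minimal illustration: function names are ours, and real planners typically add a workload-balancing constraint on top.

```python
import numpy as np

def kmeans_allocate(tasks, n_uavs, iters=50, seed=0):
    """Pre-allocate task locations to UAVs via Lloyd's k-means.

    tasks: (N, 2) array of task coordinates. Returns one label per task
    in {0, ..., n_uavs - 1}. Simplified sketch without load balancing.
    """
    tasks = np.asarray(tasks, dtype=float)
    rng = np.random.default_rng(seed)
    centers = tasks[rng.choice(len(tasks), n_uavs, replace=False)]
    labels = np.zeros(len(tasks), dtype=int)
    for _ in range(iters):
        # Assign each task to its nearest center.
        d = np.linalg.norm(tasks[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned tasks.
        for k in range(n_uavs):
            if (labels == k).any():
                centers[k] = tasks[labels == k].mean(axis=0)
    return labels
```

Each label set then becomes the input to a per-UAV tour optimizer, which is what keeps complexity from scaling combinatorially with the total number of tasks.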
2.2 Trajectory Generation and Optimization
- Sampling-Based Planners: Sample-and-connect methods generate sparse roadmaps (PRM), with connection edges representing feasible, visibility-annotated path fragments. Solutions are encoded and optimized, e.g., with random-key genetic algorithms (BRKGA) augmented with local 2-opt heuristics (Jing et al., 2020).
- Metaheuristics:
- Ant Colony Optimization: Simultaneously constructs multi-UAV tours using pheromone-guided probabilistic search, integrating formation and overlap constraints into the edge cost (Bui et al., 13 Feb 2024).
- Particle Swarm Optimization: Optimizes directly over independent or rigid-formation agent trajectories, possibly within a game-theoretic stag-hunt or formation-centric framework (Nguyen et al., 2022, Hoang, 10 Jan 2025).
- Dynamic Programming: Exact or greedily-iterated DP solutions for both static and dynamic demand settings, with approximation ratios established for greedy assignment schemes (Wang et al., 2018).
- Rapidly-exploring Random Tree (RRT) Extensions: Multi-goal RRT can efficiently find feasible paths for a fleet subject to maneuver and obstacle constraints, smoothing with Bézier interpolation inside provable safety zones (Khuat et al., 16 Apr 2025).
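The 2-opt refinement layered on top of sampling/GA planners (e.g., BRKGA + 2-opt) can be sketched as follows. This is an illustrative open-path variant, not a specific paper's implementation.

```python
import math

def path_length(points):
    """Total Euclidean length of an ordered open path."""
    return sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))

def two_opt(tour, max_passes=20):
    """2-opt local search on an open path: reverse segment tour[i..j]
    whenever doing so shortens the path."""
    tour = list(tour)
    for _ in range(max_passes):
        improved = False
        for i in range(1, len(tour) - 2):
            for j in range(i + 1, len(tour) - 1):
                # Reversing tour[i..j] swaps edges (i-1, i) and (j, j+1)
                # for (i-1, j) and (i, j+1); interior edges keep length.
                before = math.dist(tour[i - 1], tour[i]) + math.dist(tour[j], tour[j + 1])
                after = math.dist(tour[i - 1], tour[j]) + math.dist(tour[i], tour[j + 1])
                if after + 1e-12 < before:
                    tour[i:j + 1] = reversed(tour[i:j + 1])
                    improved = True
        if not improved:
            break
    return tour
```

The edge-swap delta makes each accepted move strictly improving, so the loop terminates at a 2-opt local optimum; metaheuristics supply the global search around it.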
2.3 Energy, Communication, and Physical Constraints
- Energy-Aware Planning: Models incorporate empirically validated per-UAV energy functions as a function of speed and maneuvering, permitting energy-minimizing or battery-bounded multi-agent coverage (Datsko et al., 16 Feb 2024, Samshad et al., 5 Nov 2024).
- Connectivity-Awareness: Continuous inter-UAV communication constraints are imposed by minimizing the required communication radius along the entire mission and estimating it via the maximum edge length in a time-varying Euclidean minimum spanning tree built on UAV locations (Samshad et al., 5 Nov 2024, Wu et al., 2019).
- Dubins Vehicle Constraints: Path-planning and task-assignment are tightly coupled via Dubins connection length computation, ensuring all motion segments respect minimum turn-radius and heading constraints (Liu et al., 3 Jun 2025).
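The MST-based communication-radius estimate can be sketched for a single time instant with Prim's algorithm; over a full mission one would take the maximum of this quantity across trajectory samples. Names here are illustrative.

```python
import numpy as np

def min_comm_radius(positions):
    """Communication radius needed to keep a UAV team connected at one
    time instant: the longest edge of the Euclidean minimum spanning
    tree over UAV positions. Prim's algorithm, O(n^2) -- fine for
    typical fleet sizes."""
    pos = np.asarray(positions, dtype=float)
    n = len(pos)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    # best[i]: distance from node i to the closest tree node so far.
    best = np.linalg.norm(pos - pos[0], axis=1)
    max_edge = 0.0
    for _ in range(n - 1):
        best[in_tree] = np.inf
        i = int(np.argmin(best))       # cheapest node to attach next
        max_edge = max(max_edge, float(best[i]))
        in_tree[i] = True
        best = np.minimum(best, np.linalg.norm(pos - pos[i], axis=1))
    return max_edge
```

Because the MST minimizes its maximum edge over all spanning trees, this value is exactly the smallest radius at which the instantaneous communication graph is connected.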
3. Distributed, Decentralized, and Real-Time Algorithms
- Distributed Partition and Planning: Fully synchronous two-round algorithms can distribute unvisited goals among UAVs, dynamically and efficiently partitioning labor under local energy estimates (Zhao et al., 2018).
- Online Adaptation and Emergency Handling: By integrating clustering and quick assignment, certain planners provide sub-millisecond reallocation when UAVs fail or new tasks emerge, guaranteeing continuous progress under adversarial or real-time mission perturbations (Liu et al., 3 Jun 2025).
- Deep Reinforcement Learning (DRL): Dec-POMDP formulations enable multi-UAV teams to learn decentralized policies using global and local map-based neural representations, resulting in emergent task division without explicit communication (Bayerlein et al., 2020).
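The fast-reassignment idea for UAV failures can be sketched greedily, assuming each surviving UAV is represented by its ordered waypoint list (a hypothetical interface of ours; real planners also rebalance workload afterwards):

```python
import math

def reallocate_on_failure(assignments, failed):
    """Hand a failed UAV's tasks to whichever surviving UAV currently
    ends closest to each task. Greedy nearest-endpoint insertion --
    an illustrative sketch of sub-second online reassignment.

    assignments: dict mapping UAV id -> non-empty list of (x, y) waypoints.
    """
    orphaned = assignments.pop(failed)
    for task in orphaned:
        nearest = min(assignments,
                      key=lambda u: math.dist(assignments[u][-1], task))
        assignments[nearest].append(task)
    return assignments
```

Because each task needs only a nearest-endpoint scan over surviving UAVs, the reassignment cost is O(tasks × UAVs), which is what makes sub-millisecond reaction times plausible at small fleet sizes.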
4. Objective Trade-Offs and Fairness
Multi-objective formulations capture trade-offs between global efficiency (total distance/energy) and fairness (workload balancing or makespan):
| Method / Objective | Total Distance/Energy (lower is better) | Makespan (lower is better) | Model Type / Key Reference |
|---|---|---|---|
| Iterative-Exchange A* | 1773.25 ± 167.57 | 752.15 ± 62.15 | Multi-exchange local search (Li et al., 29 Nov 2025) |
| LPT-Balanced | 2003.93 ± 194.43 | 729.87 ± 52.76 | Load balancing (Li et al., 29 Nov 2025) |
| Hungarian-Insertion | 1781.77 ± 156.54 | 1104.92 ± 326.14 | Global optimization |
| BRKGA+2-opt (SC-VRP) | 390.6 m (instance T4, 5 UAVs) | implied via min–max objective | Sampling + GA (Jing et al., 2020) |
Iterative local-exchange frameworks dominate on the composite objective, balancing efficiency and fairness simultaneously, and report state-of-the-art results across diverse terrains and instance complexities.
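The LPT-Balanced baseline in the table corresponds to the classic Longest-Processing-Time heuristic; a sketch, under the assumption that each task has a precomputed scalar cost:

```python
import heapq

def lpt_assign(task_costs, n_uavs):
    """Longest-Processing-Time-first assignment: sort tasks by cost
    descending, always give the next task to the least-loaded UAV.
    Classic makespan heuristic (4/3-approximate on identical machines).
    Returns {uav_id: [task costs assigned]}."""
    loads = [(0.0, k, []) for k in range(n_uavs)]
    heapq.heapify(loads)
    for c in sorted(task_costs, reverse=True):
        load, k, tasks = heapq.heappop(loads)  # least-loaded UAV
        tasks.append(c)
        heapq.heappush(loads, (load + c, k, tasks))
    return {k: tasks for _, k, tasks in loads}
```

Minimizing the heaviest load directly targets makespan, which is why LPT wins the fairness column in the table while conceding total distance to the exchange-based methods.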
5. Practical Constraints, Experimental Validation, and Applications
- Constraints Handled:
- Inter-agent collision avoidance via explicit spatial partitioning or minimum-separation terms in the cost function (Bui et al., 13 Feb 2024, Mansouri et al., 2016).
- No-fly zones, terrain awareness, and variable task footprints (Collins et al., 2021, Apostolidis et al., 2022, Samshad et al., 5 Nov 2024).
- Physical dynamics (turn radius, vehicle kinematics, altitude constraints) (Khuat et al., 16 Apr 2025, Liu et al., 3 Jun 2025).
- Communication limits: Minimum required mesh/radio range is optimized or strictly enforced (Samshad et al., 5 Nov 2024, Wu et al., 2019).
- Empirical Findings:
- Increasing the number of UAVs generally produces near-linear reductions in makespan until limited by the problem decomposition (e.g., number of branches or subregions) (Mansouri et al., 2016).
- Advanced planners routinely outperform naive nearest-neighbor or greedy schemes by 20–50% in total energy/distance or mission duration (Zhao et al., 2018, Bui et al., 13 Feb 2024, Jing et al., 2020).
- Energy estimation models designed from hardware data yield 95–97% match between predicted and real flight energy/mission times (Datsko et al., 16 Feb 2024, Samshad et al., 5 Nov 2024).
- Real-world field and flight experiments consistently confirm simulation results for both distributed and centralized systems (Mansouri et al., 2016, Wu et al., 2019, Apostolidis et al., 2022, Khuat et al., 16 Apr 2025).
- Application Domains: Infrastructure inspection (wind turbines, towers), post-disaster surveying, search-and-rescue, large-area monitoring, formation photogrammetry, dynamic service to time-constrained demands, and coordinated data collection from distributed IoT devices.
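The inter-agent minimum-separation constraint listed above can also be verified post hoc on synchronized trajectory samples. A minimal sketch (planners typically enforce this inside the cost rather than checking afterwards; names are illustrative):

```python
import numpy as np

def min_separation_ok(trajectories, d_min):
    """Check pairwise inter-UAV separation at synchronized time samples.

    trajectories: (n_uavs, n_steps, 3) array of positions.
    Returns True iff every pair of UAVs stays at least d_min apart
    at every sampled time step.
    """
    traj = np.asarray(trajectories, dtype=float)
    n = traj.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(traj[i] - traj[j], axis=1)
            if (d < d_min).any():
                return False
    return True
```

A sampled check like this only bounds separation at the sample times; continuous-time guarantees require the safety-zone arguments used by the Bézier-smoothing and partitioning approaches above.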
6. Extensions, Challenges, and Future Directions
- Dynamic/Uncertain Environments: Online replanning, receding-horizon approaches, and learning-based adaptation (e.g., DRL) are introduced for increased robustness to moving obstacles, time-varying demand, task failures, and UAV losses (Bayerlein et al., 2020, Liu et al., 3 Jun 2025, Zhao et al., 2018).
- Energy/Comms/Obstacle Generalization: Integrating heterogeneous UAVs, full 3D flight, multi-objective and constraint-aware partitioning, and real-time deconfliction remain ongoing research challenges (Samshad et al., 5 Nov 2024, Datsko et al., 16 Feb 2024, Jing et al., 2020).
- Algorithmic Scalability: Methods such as SCoPP and grid/cluster pre-processing demonstrate empirical scalability to ≥100 UAVs and thousands of targets with <2 minutes centralized compute (Collins et al., 2021, Apostolidis et al., 2022).
- Multi-agent Game Theory and Formation Control: Game-theoretic frameworks (e.g., Pareto-optimal stag hunt games, formation-centric PSO) provide principled convergence to collaborative equilibria in both rigid and flexible formations (Nguyen et al., 2022, Hoang, 10 Jan 2025).
- Open Source Platforms and Code: Implementation code for many major recent algorithms is openly available, facilitating further research and deployment (Bui et al., 13 Feb 2024, Khuat et al., 16 Apr 2025, Apostolidis et al., 2022).
Key Limitations
Most current models assume static, prior-known environments or structures, full model communication, and identical hardware; dynamic and adversarial scenarios, 3D wind, on-board vision-based reallocation, and partial observability remain areas of active research.
7. Summary Table of Key MUCPP Methods
| Approach | Decomposition | Optimization Core | Constraints/Features | Paper |
|---|---|---|---|---|
| 3D Model Slicing + Graph Clust. | Slices, Loops | Spectral, K-means, AssignAlg | 3D offset, yaw, collision-free | (Mansouri et al., 2016) |
| Sampling+GA (SC-VRP) | PRM, Patch Visib. | Random-key GA + 2-opt | min–max tour, patch cover | (Jing et al., 2020) |
| Ant Colony Optimization | Viewpoint Sequence | ACO, Extended mTSP | Formation, overlap, safety | (Bui et al., 13 Feb 2024) |
| Iterative Exchange | Task Assignment | Multi-local exchanges, A* | Composite efficiency+fairness | (Li et al., 29 Nov 2025) |
| Dubins+Clustered Assignment | Geometric Clusters | Greedy/Hungarian/Auction | Heading, Dubins curves, real-time | (Liu et al., 3 Jun 2025) |
| Deep RL | Dec-POMDP, Grid | Double DQN, ConvNets | Partial obs., emergent task split | (Bayerlein et al., 2020) |
MUCPP is thus an active research area at the interface of robotics, combinatorial optimization, multi-agent systems, and AI, with both strong theoretical underpinnings and substantial experimental validation across applications, environments, and algorithmic paradigms.