Reachability-Enhanced Dynamic Potential Game

Updated 22 November 2025

RE-DPG is a multi-agent planning framework that integrates reachability analysis with dynamic potential game theory to ensure safe motion planning.
The framework employs a decentralized neighborhood-dominated iterative best response (ND-iBR) algorithm to guarantee finite convergence to an ε-Nash equilibrium.
Empirical evaluations demonstrate zero collisions, stable tracking performance, and real-world applicability in multi-agent robotic systems.

The Reachability-Enhanced Dynamic Potential Game (RE-DPG) framework provides a scalable, provably safe approach for multi-agent motion planning under uncertainty by embedding reachability analysis within a dynamic potential game formulation. RE-DPG addresses the computational complexity of coupled multi-agent planning, inter-agent dependencies, and safety under bounded disturbances, combining game-theoretic optimization, decentralized iterative best response dynamics, and ellipsoidal reachability-based safety constraints (Mai et al., 15 Nov 2025).

1. Formal Framework for Multi-Agent Planning Under Uncertainty

RE-DPG models a system of $N$ agents, indexed $i\in\mathcal{N} = \{1, \ldots, N\}$, with discrete-time dynamics

$x_i^{(t+1)} = f_i(x_i^{(t)}, u_i^{(t)}), \quad x_i^{(t)}\in\mathcal{X}_i\subset\mathbb{R}^{n_x},\quad u_i^{(t)}\in\mathcal{U}_i\subset\mathbb{R}^{n_u}$

subject to a bounded additive disturbance $w_i^{(t)}$ with $\|w_i^{(t)}\| \leq \overline{w}$. Each agent aims to drive its state from $x_i^{(0)}$ to the goal $x_i^F$ in $T$ steps while minimizing a cost functional with tracking, energy, and safety components:

$J_i(x^{(0)}, (u_i, u_{-i})) = \sum_{t=0}^{T-1} L_i(x^{(t)}, u_i^{(t)}) + L_i^F(x^{(T)})$

A minimum separation $d^{Col}$ and a speed limit $v^{Max}$ must hold at every time step for all admissible disturbance realizations.
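The setup above can be illustrated with a minimal single-agent sketch. Everything specific here is an assumption made for illustration and not taken from the paper: single-integrator dynamics $f_i(x,u) = x + \Delta t\,u$, a quadratic tracking-plus-energy stage cost with weight `r`, and the hypothetical helper names `rollout` and `cost`. Coupling and safety terms are omitted.

```python
import numpy as np

def rollout(x0, controls, dt=0.2, w_bar=0.05, rng=None):
    """Propagate x^{t+1} = x^t + dt*u^t + w^t with ||w^t|| <= w_bar (assumed model)."""
    rng = np.random.default_rng(0) if rng is None else rng
    xs = [np.asarray(x0, dtype=float)]
    for u in controls:
        w = rng.normal(size=xs[-1].shape)
        # Rescale so the disturbance lies inside the ball of radius w_bar.
        w *= w_bar * rng.uniform() / max(np.linalg.norm(w), 1e-12)
        xs.append(xs[-1] + dt * np.asarray(u, dtype=float) + w)
    return np.array(xs)

def cost(xs, controls, x_goal, r=0.1):
    """J = sum_t (||x^t - x_goal||^2 + r*||u^t||^2) + ||x^T - x_goal||^2."""
    controls = np.asarray(controls, dtype=float)
    stage = np.sum((xs[:-1] - x_goal) ** 2) + r * np.sum(controls ** 2)
    terminal = np.sum((xs[-1] - x_goal) ** 2)
    return stage + terminal

controls = np.tile([1.0, 0.0], (10, 1))   # constant velocity command, T = 10
xs = rollout([0.0, 0.0], controls)
J = cost(xs, controls, x_goal=np.array([2.0, 0.0]))
```

Each realized step differs from the nominal step $\Delta t\,u$ by at most $\overline{w}$ in norm, which is the property the reachability analysis later relies on.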

2. Dynamic Potential Game Formulation

Individual agent costs decompose into a tracking component and symmetric pairwise coupling components, $C_{i,j}^{Coup} = C_{j,i}^{Coup}$:

$L_i(x^{(t)}, u_i^{(t)}) = C_i^{Tr}(x_i^{(t)}, u_i^{(t)}) + \sum_{j\ne i} C_{i,j}^{Coup}(x_i^{(t)}, x_j^{(t)})$

with analogous terminal terms $C_i^{Tr,F}$ and $C_{i,j}^{Coup,F}$.

The game defines a global stage potential $P(\cdot,\cdot)$ and a terminal potential $R(\cdot)$:

$P(x^{(t)}, u^{(t)}) = \sum_i C_i^{Tr}(x_i^{(t)}, u_i^{(t)}) + \sum_{i<j} C_{i,j}^{Coup}(x_i^{(t)}, x_j^{(t)})$

$R(x^{(T)}) = \sum_i C_i^{Tr,F}(x_i^{(T)}) + \sum_{i<j} C_{i,j}^{Coup,F}(x_i^{(T)}, x_j^{(T)})$

yielding a horizon-wise potential

$\Phi(u) = \sum_{t=0}^{T-1} P(x^{(t)}, u^{(t)}) + R(x^{(T)})$

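Any unilateral control update changes $J_i$ and $\Phi$ by the same amount, that is, $J_i(u_i', u_{-i}) - J_i(u_i, u_{-i}) = \Phi(u_i', u_{-i}) - \Phi(u_i, u_{-i})$, because the terms of $\Phi$ that do not involve agent $i$ are unaffected by a change in $u_i$. Hence every minimizer of $\Phi(u)$ is a Nash equilibrium (NE), and under standard assumptions (compactness, continuity) an NE, and thus an $\varepsilon$-NE, always exists.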
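The potential property can be checked numerically. The sketch below is illustrative only: it assumes single-integrator dynamics, a quadratic tracking cost, and a Gaussian-shaped symmetric coupling term, none of which are specified in the source, and it uses the same functions for stage and terminal terms. It changes one agent's controls and compares the resulting change in $J_i$ with the change in $\Phi$.

```python
import numpy as np

DT, R = 0.2, 0.1   # assumed step size and control weight

def rollout(x0, u):
    """x0: (N, 2), u: (T, N, 2) -> states (T+1, N, 2) for x+ = x + DT*u."""
    return np.concatenate([x0[None], x0[None] + DT * np.cumsum(u, axis=0)])

def coupling(xa, xb):
    """Symmetric pairwise term (assumed form), larger when agents are closer."""
    return np.exp(-np.sum((xa - xb) ** 2, axis=-1))

def agent_cost(i, x, u, goals):
    """J_i: own tracking plus coupling with every other agent."""
    n = x.shape[1]
    track = np.sum((x[:, i] - goals[i]) ** 2) + R * np.sum(u[:, i] ** 2)
    coup = sum(np.sum(coupling(x[:, i], x[:, j])) for j in range(n) if j != i)
    return track + coup

def potential(x, u, goals):
    """Phi: all tracking terms plus each unordered pair counted once."""
    n = x.shape[1]
    track = np.sum((x - goals[None]) ** 2) + R * np.sum(u ** 2)
    coup = sum(np.sum(coupling(x[:, i], x[:, j]))
               for i in range(n) for j in range(i + 1, n))
    return track + coup

rng = np.random.default_rng(0)
x0, goals = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))
u = rng.normal(size=(5, 3, 2))
v = u.copy()
v[:, 1] = rng.normal(size=(5, 2))          # only agent 1 deviates
xu, xv = rollout(x0, u), rollout(x0, v)
dJ = agent_cost(1, xv, v, goals) - agent_cost(1, xu, u, goals)
dPhi = potential(xv, v, goals) - potential(xu, u, goals)
print(np.isclose(dJ, dPhi))
```

The two differences agree because pairs that do not contain the deviating agent contribute the same value to $\Phi$ before and after the update.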

3. Neighborhood-Dominated Iterative Best Response (ND-iBR)

The ND-iBR algorithm is an iterated $\varepsilon$-Best Response (i$\varepsilon$-BR) method for finding $\varepsilon$-Nash equilibria in a distributed, scalable manner.

At every iteration $k$, each agent $i$ checks whether a local improvement of at least $\varepsilon$ is possible:

$J_i(u_i^k, u_{-i}^k) - \inf_{v_i} J_i(v_i, u_{-i}^k) \geq \varepsilon$

If so, the best-response subproblem is solved locally,

$u_i^{k+1} = \arg\min_{u_i} J_i(u_i, u_{-i}^k)$

and only the improving agent's component of $u^k$ is updated. Each update lowers the potential $\Phi$ by at least $\varepsilon$, and $\Phi$ is continuous and bounded below, so the process terminates after finitely many steps at an $\varepsilon$-NE.
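A minimal sketch of an iterated $\varepsilon$-best-response loop on a toy static potential game follows. The game is an assumption chosen so that the best response has a closed form (scalar actions, quadratic tracking toward hypothetical targets `g`, quadratic symmetric coupling with weight `c`). It is not the trajectory-level subproblem solved in RE-DPG, and picking the agent with the largest gain is one possible selection rule, not necessarily the paper's.

```python
import numpy as np

def cost(i, u, g, c):
    """J_i = (u_i - g_i)^2 + c * sum_{j != i} (u_i - u_j)^2."""
    return (u[i] - g[i]) ** 2 + c * np.sum((u[i] - np.delete(u, i)) ** 2)

def potential(u, g, c):
    """Phi = sum_i (u_i - g_i)^2 + c * sum_{i<j} (u_i - u_j)^2."""
    diff = u[:, None] - u[None, :]
    return np.sum((u - g) ** 2) + 0.5 * c * np.sum(diff ** 2)

def best_response(i, u, g, c):
    """Closed-form minimizer of J_i over u_i with the other actions fixed."""
    others = np.sum(u) - u[i]
    return (g[i] + c * others) / (1.0 + c * (len(u) - 1))

def iterated_eps_br(u0, g, c=0.5, eps=1e-6, max_iter=10_000):
    u = np.array(u0, dtype=float)
    history = [potential(u, g, c)]
    for _ in range(max_iter):
        gains = []
        for i in range(len(u)):
            v = u.copy()
            v[i] = best_response(i, u, g, c)
            gains.append(cost(i, u, g, c) - cost(i, v, g, c))
        i = int(np.argmax(gains))
        if gains[i] < eps:            # no agent can improve by eps: eps-NE
            break
        u[i] = best_response(i, u, g, c)   # update only the improving agent
        history.append(potential(u, g, c))
    return u, history

g = np.array([0.0, 1.0, 2.0, 4.0])
u_star, phi_history = iterated_eps_br(np.zeros(4), g)
```

In this toy game every accepted update lowers `phi_history` by at least `eps`, which is the mechanism behind the finite-termination argument.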

Agents restrict their coupling to local neighborhoods:

$\mathcal{N}_i^{(t)} = \{ j\neq i \mid \|\mathbf{p}_i^{(t)} - \mathbf{p}_j^{(t)}\| < d^{Prox} \}$

where $d^{Prox} = 2 v^{Max}\Delta t$ is the largest distance by which two agents moving at up to $v^{Max}$ can approach each other within one time step. Each agent then minimizes a local cost $\hat J_i$ that considers only neighbors $j \in \mathcal{N}_i^{(t)}$. The decentralized ND-iBR procedure preserves the convergence guarantees in this localized setting.
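The neighborhood rule can be sketched directly from its definition. The function name `neighborhoods` and the example positions are hypothetical; only the threshold $d^{Prox} = 2 v^{Max}\Delta t$ comes from the text.

```python
import numpy as np

def neighborhoods(positions, v_max, dt):
    """Return, per agent i, the indices j != i with ||p_i - p_j|| < 2*v_max*dt."""
    p = np.asarray(positions, dtype=float)
    d_prox = 2.0 * v_max * dt
    dist = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)        # an agent is not its own neighbor
    return [np.flatnonzero(row < d_prox).tolist() for row in dist]

p = [[0.0, 0.0, 1.0], [0.5, 0.0, 1.0], [5.0, 5.0, 1.0]]
print(neighborhoods(p, v_max=2.0, dt=0.2))   # d_prox = 0.8 -> [[1], [0], []]
```

The all-pairs distance matrix is quadratic in $N$; a decentralized implementation would more plausibly have each agent query only the positions it receives, but that communication layer is outside this sketch.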

4. Multi-Agent Forward Reachable Set Analysis

To address safety under model and disturbance uncertainty, RE-DPG uses Multi-Agent Forward Reachable Sets (MA-FRS). For each agent, the linearized error dynamics under LQR feedback and bounded disturbances yield an ellipsoidal reachable set at each time $t$:

$\mathcal{E}_i^{(t)} = \{x \mid (x-\bar x_i^{(t)})^T (Q_i^{(t)})^{-1} (x-\bar x_i^{(t)}) \leq 1\}$

with the shape matrix $Q_i^{(t)}$ propagated using a Lyapunov and Minkowski-sum recursion.
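One way such a recursion can look is sketched below. The source does not give the exact update, so this uses a standard construction as an assumption: the error ellipsoid is mapped through an assumed closed-loop matrix `A_cl` (standing in for the LQR closed loop), and the Minkowski sum with the disturbance ball is over-approximated by the trace-minimal outer ellipsoid $(1+1/\beta)Q_1 + (1+\beta)Q_2$ with $\beta = \sqrt{\mathrm{tr}\,Q_1/\mathrm{tr}\,Q_2}$. The numerical values are hypothetical.

```python
import numpy as np

def minkowski_outer(Q1, Q2):
    """Shape matrix of a trace-minimal ellipsoid containing E(Q1) + E(Q2)."""
    t1, t2 = np.trace(Q1), np.trace(Q2)
    if t1 == 0.0:
        return Q2.copy()
    if t2 == 0.0:
        return Q1.copy()
    beta = np.sqrt(t1 / t2)
    return (1.0 + 1.0 / beta) * Q1 + (1.0 + beta) * Q2

def propagate_frs(Q0, A_cl, w_bar, steps):
    """Q^{t+1} = outer(A_cl Q^t A_cl^T, w_bar^2 I) for t = 0..steps-1."""
    W = (w_bar ** 2) * np.eye(Q0.shape[0])   # disturbance ball as an ellipsoid
    Qs = [np.asarray(Q0, dtype=float)]
    for _ in range(steps):
        Qs.append(minkowski_outer(A_cl @ Qs[-1] @ A_cl.T, W))
    return Qs

A_cl = np.array([[1.0, 0.2], [-0.5, 0.6]])   # assumed stable closed-loop matrix
Qs = propagate_frs(0.01 * np.eye(2), A_cl, w_bar=0.05, steps=20)
```

Because each step uses an outer bound, every error trajectory that starts inside the initial ellipsoid and is driven by disturbances of norm at most `w_bar` stays inside the propagated ellipsoids, at the price of some conservatism.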

Collision avoidance between agents $i$ and $j$ is encoded by requiring that their position ellipsoids do not intersect, enforced via:

$\xi_{i,j}^{(t)} = (p_i^{(t)} - p_j^{(t)})^T ( \hat Q_i^{(t)} \boxplus \hat Q_j^{(t)} )^{-1} (p_i^{(t)} - p_j^{(t)}) - 1 > 0$

where $\hat Q_i^{(t)}$ is the position submatrix of $Q_i^{(t)}$. Exponential penalty terms in the cost function act as soft constraints on $\xi_{i,j}^{(t)} > 0$, preserving differentiability for gradient-based optimization while enforcing collision avoidance.
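The separation margin and a penalty of this kind can be sketched as follows. Two points are assumptions, since the source does not define them: $\boxplus$ is taken to be the trace-minimal outer ellipsoid of the Minkowski sum, and the penalty is taken to be $\exp(-\alpha\,\xi)$ with hypothetical parameters `weight` and `sharpness`.

```python
import numpy as np

def outer_sum(Qi, Qj):
    """Assumed meaning of the boxplus operator: outer ellipsoid of E(Qi) + E(Qj)."""
    beta = np.sqrt(np.trace(Qi) / np.trace(Qj))
    return (1.0 + 1.0 / beta) * Qi + (1.0 + beta) * Qj

def separation_margin(p_i, p_j, Qi_pos, Qj_pos):
    """xi = d^T (Qi boxplus Qj)^{-1} d - 1 with d = p_i - p_j; xi > 0 means disjoint."""
    d = np.asarray(p_i, dtype=float) - np.asarray(p_j, dtype=float)
    return float(d @ np.linalg.solve(outer_sum(Qi_pos, Qj_pos), d) - 1.0)

def exp_penalty(xi, weight=1.0, sharpness=5.0):
    """Smooth soft-constraint cost (assumed form): grows as xi drops below zero."""
    return weight * np.exp(-sharpness * xi)

Q = 0.25 * np.eye(2)                               # two discs of radius 0.5
far = separation_margin([2.0, 0.0], [0.0, 0.0], Q, Q)    # centers 2.0 apart
near = separation_margin([0.8, 0.0], [0.0, 0.0], Q, Q)   # centers 0.8 apart
```

For the two discs of radius 0.5, the combined shape matrix is the identity (radius 1.0), so `far` is positive and `near` is negative, and the penalty is correspondingly small for the first pair and large for the second.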

5. Theoretical Properties and Guarantees

The ND-iBR process provides finite-step convergence to an $\varepsilon$-NE, as the potential function $\Phi$ is decreased by at least $\varepsilon$ per iteration and is bounded below. The use of MA-FRS-based constraints guarantees that, at any candidate $\varepsilon$-NE, all neighbor agent pairs will have positively separated reachable sets throughout the planning horizon, ensuring minimum separation under all admissible disturbances.
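The finite-step claim can be made quantitative with a short worked bound. Write $\Phi_{\min}$ for a lower bound on $\Phi$ (a symbol introduced here for illustration) and suppose the procedure has performed $K$ updates, each lowering $\Phi$ by at least the tolerance $\varepsilon$. Then

$\Phi_{\min} \leq \Phi(u^K) \leq \Phi(u^0) - K\varepsilon$

so that

$K \leq \dfrac{\Phi(u^0) - \Phi_{\min}}{\varepsilon}$

As a purely illustrative instance with made-up numbers, $\Phi(u^0) - \Phi_{\min} = 12$ and $\varepsilon = 0.01$ give at most $1200$ updates. The bound is a worst case: it says nothing about how many updates a typical run needs.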

The framework accommodates decentralized computation while maintaining global safety, leveraging the structure of dynamic potential games for tractable optimization and stability.

6. Experimental Results and Empirical Evaluation

Simulation experiments were run in 3D environments ($30\times30\times10\ \mathrm{m}^3$, $N\in\{3,\ldots,15\}$ quadrotors, $T=50$, receding-horizon MPC with horizon length $20$, $\Delta t=0.2$ s) subjected to zero-mean truncated Gaussian disturbances ($\sigma\in\{0.02, 0.05, 0.10, 0.15\}$). Across 50 Monte Carlo runs, RE-DPG achieved zero collisions, stable tracking cost, and competitive time efficiency. MA-FRS propagation required approximately 102 ms per agent. Baseline comparisons included non-FRS penalties, PiLQR, and Swarm-DRL.

Real-world 2D hardware validation on Raspberry Pi 5-equipped 4WD vehicles using ROS-based distributed planning further confirmed the framework’s practical effectiveness. Scenarios included multi-vehicle intersection negotiation and non-neighbor cases, all of which resulted in collision-free execution (minimum pairwise distance $\geq 0.4$ m, zero near-collisions) and dynamic compliance with yielding behaviors.

7. Synthesis and Significance

RE-DPG systematically couples dynamic potential games, the ND-iBR protocol, and FRS-based reachability to achieve robust multi-agent coordination. The framework delivers decentralized, scalable, and provably safe motion planning under bounded disturbances, with theoretical and empirical evidence for finite convergence and minimum-separation guarantees. A plausible implication is that RE-DPG can be extended beyond multi-robot domains to general settings where collaborative agents require proactive safety under bounded uncertainty, provided that suitable local dynamics linearizations and coupling structures exist (Mai et al., 15 Nov 2025).

References (1)

Game-Theoretic Safe Multi-Agent Motion Planning with Reachability Analysis for Dynamic and Uncertain Environments (Extended Version) (2025)
