
Reachability-Enhanced Dynamic Potential Game

Updated 22 November 2025
  • RE-DPG is a multi-agent planning framework that integrates reachability analysis with dynamic potential game theory to ensure safe motion planning.
  • The framework employs a decentralized neighborhood-dominated iterative best response (ND-iBR) algorithm to guarantee finite convergence to an ε-Nash equilibrium.
  • Empirical evaluations demonstrate zero collisions, stable tracking performance, and real-world applicability in multi-agent robotic systems.

The Reachability-Enhanced Dynamic Potential Game (RE-DPG) framework provides a scalable, provably safe approach for multi-agent motion planning under uncertainty by embedding reachability analysis within a dynamic potential game formulation. RE-DPG addresses the computational complexity of coupled multi-agent planning, inter-agent dependencies, and safety under bounded disturbances, combining game-theoretic optimization, decentralized iterative best response dynamics, and ellipsoidal reachability-based safety constraints (Mai et al., 15 Nov 2025).

1. Formal Framework for Multi-Agent Planning Under Uncertainty

RE-DPG models a system of $N$ agents, indexed $i \in \mathcal{N} = \{1, \dots, N\}$, with discrete-time dynamics

$$x_i^{(t+1)} = f_i(x_i^{(t)}, u_i^{(t)}), \quad x_i^{(t)} \in \mathcal{X}_i \subset \mathbb{R}^{n_x}, \quad u_i^{(t)} \in \mathcal{U}_i \subset \mathbb{R}^{n_u}$$

subject to a bounded additive disturbance $w_i^{(t)}$ with $\|w_i^{(t)}\| \leq \overline{w}$. Each agent aims to drive its state from $x_i^{(0)}$ to $x_i^F$ in $T$ steps while optimizing a cost functional that reflects tracking, energy, and safety components:

$$J_i(x^{(0)}, (u_i, u_{-i})) = \sum_{t=0}^{T-1} L_i(x^{(t)}, u_i^{(t)}) + L_i^F(x^{(T)})$$

A minimum separation $d^{Col}$ and a speed limit $v^{Max}$ are enforced at all times under all admissible disturbance realizations.
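As a concrete toy instance of these definitions, the sketch below rolls out a single-integrator agent under a bounded disturbance and accumulates a quadratic tracking/energy cost. The dynamics `f`, the cost weights, and the per-coordinate disturbance bound are illustrative stand-ins, not the paper's models.

```python
import numpy as np

# Illustrative single-integrator instance of the per-agent dynamics and
# cost functional (f, L, L^F, and all weights are toy stand-ins).
def f(x, u, dt=0.2):
    return x + dt * u                      # x_{t+1} = f(x_t, u_t)

def rollout_cost(x0, us, x_goal, w_bar=0.05, seed=0):
    """Simulate T steps under a disturbance bounded per coordinate by
    w_bar (an assumption; the paper bounds the disturbance norm)."""
    rng = np.random.default_rng(seed)
    x, cost = x0, 0.0
    for u in us:                           # sum_{t=0}^{T-1} L(x_t, u_t)
        cost += np.sum((x - x_goal) ** 2) + 0.1 * np.sum(u ** 2)
        w = rng.uniform(-w_bar, w_bar, size=x.shape)
        x = f(x, u) + w
    return cost + 10.0 * np.sum((x - x_goal) ** 2)  # terminal L^F(x_T)
```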

2. Dynamic Potential Game Formulation

Individual agent costs decompose into a tracking term and symmetric pairwise coupling terms:

$$L_i(x^{(t)}, u_i^{(t)}) = C_i^{Tr}(x_i^{(t)}, u_i^{(t)}) + \sum_{j \ne i} C_{i,j}^{Coup}(x_i^{(t)}, x_j^{(t)})$$

with terminal analogues.

The game defines a global stage potential $P(\cdot, \cdot)$ and terminal potential $R(\cdot)$:

$$P(x^{(t)}, u^{(t)}) = \sum_i C_i^{Tr}(x_i^{(t)}, u_i^{(t)}) + \sum_{i<j} C_{i,j}^{Coup}(x_i^{(t)}, x_j^{(t)})$$

$$R(x^{(T)}) = \sum_i C_i^{Tr,F}(x_i^{(T)}) + \sum_{i<j} C_{i,j}^{Coup,F}(x_i^{(T)}, x_j^{(T)})$$

yielding a horizon-wise potential

$$\Phi(u) = \sum_{t=0}^{T-1} P(x^{(t)}, u^{(t)}) + R(x^{(T)})$$

Any unilateral control update changes $J_i$ and $\Phi$ by the same amount. Hence Nash equilibria (NE) correspond to stationary points of $\Phi(u)$, and under standard assumptions (compactness, continuity) an NE, and thus an $\varepsilon$-NE, always exists.
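The identity between unilateral cost changes and potential changes can be checked numerically on a toy two-agent, single-step game; the tracking and coupling functions below are illustrative, not the paper's.

```python
import numpy as np

# Toy 1-D, single-step potential game: J_i = C_i^Tr(u_i) + C^Coup(u_1, u_2)
# with the symmetric coupling term shared by both agents; the potential
# sums the tracking costs plus the coupling term exactly once.
rng = np.random.default_rng(0)
goals = [1.0, -1.0]

def tracking(u, goal):
    return (u - goal) ** 2

def coupling(u1, u2):
    return np.exp(-(u1 - u2) ** 2)   # large when agents are close (toy)

def J(i, u):
    return tracking(u[i], goals[i]) + coupling(u[0], u[1])

def Phi(u):
    return tracking(u[0], goals[0]) + tracking(u[1], goals[1]) + coupling(u[0], u[1])

u = rng.normal(size=2)
for i in range(2):
    v = u.copy()
    v[i] += rng.normal()             # unilateral deviation by agent i
    assert abs((J(i, v) - J(i, u)) - (Phi(v) - Phi(u))) < 1e-12
```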

3. Neighborhood-Dominated Iterative Best Response (ND-iBR)

The ND-iBR algorithm is an iterated $\varepsilon$-best response (i$\varepsilon$-BR) method for finding $\varepsilon$-Nash equilibria in a distributed, scalable manner.

At every iteration, each agent $i$ checks whether a local improvement of at least $\varepsilon$ is possible:

$$J_i(u_i^k, u_{-i}^k) - \inf_{v_i} J_i(v_i, u_{-i}^k) \geq \varepsilon$$

If so, the best-response subproblem is solved locally:

$$u_i^{k+1} = \arg\min_{u_i} J_i(u_i, u_{-i}^k)$$

and $u^k$ is updated only for the improving agent. This process is guaranteed to terminate in finitely many steps at an $\varepsilon$-NE, since each accepted update decreases the potential $\Phi$ by at least $\varepsilon$ and $\Phi$ is continuous and bounded below.
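A minimal sketch of the iterated $\varepsilon$-best-response loop, using a static quadratic potential game with a closed-form best response as a stand-in for the full horizon-wise problem; all names and cost terms are illustrative.

```python
import numpy as np

# Toy quadratic potential game: J_i = (u_i - g_i)^2 + 0.5 sum_{j!=i}(u_i - u_j)^2.
# The symmetric coupling makes this an exact potential game, and the
# best response has a closed form from the first-order condition.
def J(i, u, goals):
    others = np.delete(u, i)
    return (u[i] - goals[i]) ** 2 + 0.5 * np.sum((u[i] - others) ** 2)

def best_response(i, u, goals):
    # 2(u_i - g_i) + sum_{j != i} (u_i - u_j) = 0
    others = np.delete(u, i)
    return (2 * goals[i] + others.sum()) / (2 + len(others))

def nd_ibr(goals, eps=1e-6, max_iter=1000):
    u = np.zeros(len(goals))
    for _ in range(max_iter):
        improved = False
        for i in range(len(goals)):
            v = best_response(i, u, goals)
            if J(i, u, goals) - J(i, np.r_[u[:i], v, u[i+1:]], goals) >= eps:
                u[i] = v          # update only the improving agent
                improved = True
                break             # one unilateral update per iteration
        if not improved:
            return u              # no agent can improve by eps: eps-NE
    return u

u_star = nd_ibr(np.array([1.0, -1.0, 0.5]))
```

Each accepted update lowers the shared potential by at least `eps`, which is why the outer loop must terminate.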

Agents restrict their coupling to local neighborhoods:

$$\mathcal{N}_i^{(t)} = \{ j \neq i \mid \|\mathbf{p}_i^{(t)} - \mathbf{p}_j^{(t)}\| < d^{Prox} \}$$

where $d^{Prox} = 2 v^{Max} \Delta t$, yielding for each agent a local cost $\hat J_i$ that considers only neighbors $j \in \mathcal{N}_i^{(t)}$. The decentralized ND-iBR procedure preserves its convergence guarantees in this localized setting.
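The proximity-based neighborhood construction can be sketched directly; the positions and parameter values below are illustrative.

```python
import numpy as np

def neighborhoods(p, v_max, dt):
    """p: (N, dim) array of agent positions at time t.
    Returns N_i = {j != i : ||p_i - p_j|| < d_prox} for each agent."""
    d_prox = 2.0 * v_max * dt
    dists = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    return [
        {j for j in range(len(p)) if j != i and dists[i, j] < d_prox}
        for i in range(len(p))
    ]

p = np.array([[0.0, 0.0], [0.8, 0.0], [10.0, 0.0]])
nbrs = neighborhoods(p, v_max=2.5, dt=0.2)   # d_prox = 1.0 m
```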

4. Multi-Agent Forward Reachable Set Analysis

To address safety under model and disturbance uncertainty, RE-DPG uses Multi-Agent Forward Reachable Sets (MA-FRS). For each agent, the linearized error dynamics under LQR feedback and bounded disturbances yield, at each $t$, an ellipsoidal reachable set:

$$\mathcal{E}_i^{(t)} = \{x \mid (x - \bar x_i^{(t)})^T (Q_i^{(t)})^{-1} (x - \bar x_i^{(t)}) \leq 1\}$$

with the shape matrix $Q_i^{(t)}$ propagated via a Lyapunov-type recursion with Minkowski sums.
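The paper's exact shape-matrix recursion is not reproduced here; the sketch below uses the standard trace-ratio outer approximation of the Minkowski sum of two ellipsoids, propagated under an illustrative closed-loop linear error dynamics (`A_cl`, `Q0`, and `W` are assumptions).

```python
import numpy as np

def minkowski_outer(Q1, Q2):
    """Trace-minimal outer ellipsoid of the Minkowski sum of two
    ellipsoids with shape matrices Q1, Q2 (standard approximation)."""
    p = np.sqrt(np.trace(Q1) / np.trace(Q2))
    return (1.0 + 1.0 / p) * Q1 + (1.0 + p) * Q2

def propagate_frs(A_cl, Q0, W, steps):
    """Propagate shape matrices so E(Q_{t+1}) contains A_cl E(Q_t) + E(W)."""
    Q, history = Q0, [Q0]
    for _ in range(steps):
        Q = minkowski_outer(A_cl @ Q @ A_cl.T, W)
        history.append(Q)
    return history

A_cl = np.array([[0.9, 0.1], [0.0, 0.9]])   # illustrative closed-loop matrix
Q0 = 0.01 * np.eye(2)                        # initial error ellipsoid
W = 0.0025 * np.eye(2)                       # disturbance-bound ellipsoid
Qs = propagate_frs(A_cl, Q0, W, steps=10)
```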

Collision avoidance between agents $i$ and $j$ is encoded by requiring non-intersection of their position ellipsoids, enforced via

$$\xi_{i,j}^{(t)} = (p_i^{(t)} - p_j^{(t)})^T (\hat Q_i^{(t)} \boxplus \hat Q_j^{(t)})^{-1} (p_i^{(t)} - p_j^{(t)}) - 1 > 0$$

where $\hat Q_i^{(t)}$ is the position submatrix of $Q_i^{(t)}$ and $\boxplus$ denotes the Minkowski sum of ellipsoid shape matrices. Exponential penalty terms in the cost function soft-constrain $\xi_{i,j}^{(t)} > 0$, preserving differentiability for gradient-based optimization while ensuring collision avoidance.
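A sketch of the pairwise margin $\xi_{i,j}$ and its exponential soft penalty, using the trace-ratio Minkowski outer approximation as a stand-in for $\boxplus$; the penalty weight and gain are illustrative.

```python
import numpy as np

def minkowski_outer(Q1, Q2):
    # Standard trace-based outer approximation of the Minkowski sum.
    p = np.sqrt(np.trace(Q1) / np.trace(Q2))
    return (1.0 + 1.0 / p) * Q1 + (1.0 + p) * Q2

def safety_margin(p_i, p_j, Q_hat_i, Q_hat_j):
    """xi_{i,j}: positive iff the position ellipsoid centers are
    separated by more than the combined ellipsoid."""
    d = p_i - p_j
    Q_sum = minkowski_outer(Q_hat_i, Q_hat_j)
    return d @ np.linalg.solve(Q_sum, d) - 1.0

def exp_penalty(xi, k=10.0, w=1.0):
    # Smooth, differentiable soft constraint pushing xi > 0
    # (gain k and weight w are illustrative tuning parameters).
    return w * np.exp(-k * xi)

p_i, p_j = np.array([0.0, 0.0]), np.array([2.0, 0.0])
Q = 0.04 * np.eye(2)          # illustrative position sub-block
xi = safety_margin(p_i, p_j, Q, Q)
```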

5. Theoretical Properties and Guarantees

The ND-iBR process provides finite-step convergence to an $\varepsilon$-NE: each accepted update decreases the potential $\Phi$ by at least $\varepsilon$, and $\Phi$ is bounded below. The MA-FRS-based constraints guarantee that, at any candidate $\varepsilon$-NE, all neighboring agent pairs have positively separated reachable sets throughout the planning horizon, ensuring minimum separation under all admissible disturbances.

The framework accommodates decentralized computation while maintaining global safety, leveraging the structure of dynamic potential games for tractable optimization and stability.

6. Experimental Results and Empirical Evaluation

Simulation experiments in 3D environments ($30 \times 30 \times 10$ m$^3$, $N \in \{3, \dots, 15\}$ quadrotors, $T = 50$, receding-horizon MPC length 20, $\Delta t = 0.2$ s), subjected to zero-mean truncated Gaussian disturbances ($\sigma \in \{0.02, 0.05, 0.10, 0.15\}$) across 50 Monte Carlo runs, demonstrated that RE-DPG achieved zero collisions, stable tracking cost, and competitive time efficiency. MA-FRS propagation required approximately 102 ms per agent. Baseline comparisons included non-FRS penalty methods, PiLQR, and Swarm-DRL.

Real-world 2D hardware validation on Raspberry Pi 5-equipped 4WD vehicles using ROS-based distributed planning further confirmed the framework's practical effectiveness. Scenarios included multi-vehicle intersection negotiation and non-neighbor cases, all of which resulted in collision-free execution (minimum pairwise distance $\geq 0.4$ m, zero near-collisions) and dynamic compliance with yielding behaviors.

7. Synthesis and Significance

RE-DPG systematically couples dynamic potential games, the ND-iBR protocol, and FRS-based reachability to achieve robust multi-agent coordination. The framework delivers decentralized, scalable, and provably safe motion planning under stochastic disturbances, with theoretical and empirical evidence for finite convergence and minimum-separation guarantees. A plausible implication is that RE-DPG can be extended beyond multi-robot domains to general settings where collaborative agents require proactive safety under bounded uncertainty, provided that suitable local dynamics linearizations and coupling structures exist (Mai et al., 15 Nov 2025).
