Reachability-Enhanced Dynamic Potential Game

Updated 22 November 2025
  • RE-DPG is a multi-agent planning framework that integrates reachability analysis with dynamic potential game theory to ensure safe motion planning.
  • The framework employs a decentralized neighborhood-dominated iterative best response (ND-iBR) algorithm to guarantee finite convergence to an ε-Nash equilibrium.
  • Empirical evaluations demonstrate zero collisions, stable tracking performance, and real-world applicability in multi-agent robotic systems.

The Reachability-Enhanced Dynamic Potential Game (RE-DPG) framework provides a scalable, provably safe approach for multi-agent motion planning under uncertainty by embedding reachability analysis within a dynamic potential game formulation. RE-DPG addresses the computational complexity of coupled multi-agent planning, inter-agent dependencies, and safety under bounded disturbances, combining game-theoretic optimization, decentralized iterative best response dynamics, and ellipsoidal reachability-based safety constraints (Mai et al., 15 Nov 2025).

1. Formal Framework for Multi-Agent Planning Under Uncertainty

RE-DPG models a system of $N$ agents, indexed $i\in\mathcal{N} = \{1, \dots, N\}$, with discrete-time dynamics

$$x_i^{(t+1)} = f_i(x_i^{(t)}, u_i^{(t)}), \quad x_i^{(t)}\in\mathcal{X}_i\subset\mathbb{R}^{n_x},\quad u_i^{(t)}\in\mathcal{U}_i\subset\mathbb{R}^{n_u}$$

subject to a bounded additive disturbance $w_i^{(t)}$ with $\|w_i^{(t)}\| \leq \overline{w}$. Each agent aims to drive its state from $x_i^{(0)}$ to $x_i^F$ in $T$ steps while optimizing a cost functional that reflects tracking, energy, and safety components:

$$J_i(x^{(0)}, (u_i, u_{-i})) = \sum_{t=0}^{T-1} L_i(x^{(t)}, u_i^{(t)}) + L_i^F(x^{(T)})$$

A minimum separation $d^{Col}$ and a speed limit are enforced at all times for every admissible disturbance realization.
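As a rough illustration of how the finite-horizon cost $J_i$ is assembled from stage and terminal terms, the following sketch evaluates it for one agent. The 1D single-integrator dynamics, the quadratic tracking and energy terms, the weights, and all function names are assumptions made for this example; the safety component of the cost is omitted here.

```python
def rollout(x0, controls, dt=0.1):
    """Simulate x^{(t+1)} = x^{(t)} + dt * u^{(t)} (hypothetical 1D single integrator)."""
    xs = [x0]
    for u in controls:
        xs.append(xs[-1] + dt * u)
    return xs


def agent_cost(xs, controls, x_goal, energy_weight=0.1, terminal_weight=10.0):
    """J_i = sum of stage costs over t = 0..T-1 plus a terminal cost (assumed quadratic forms)."""
    stage = sum((x - x_goal) ** 2 + energy_weight * u ** 2
                for x, u in zip(xs[:-1], controls))
    terminal = terminal_weight * (xs[-1] - x_goal) ** 2
    return stage + terminal


controls = [1.0] * 10                 # T = 10 steps of constant input
xs = rollout(0.0, controls)           # state trajectory x^{(0)}, ..., x^{(T)}
cost = agent_cost(xs, controls, x_goal=1.0)
print(round(cost, 6))
```

The stage sum runs over the first $T$ states paired with the $T$ controls, and the terminal term uses only the final state, mirroring the structure of the formula.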

2. Dynamic Potential Game Formulation

Individual agent costs decompose into a tracking component, which depends only on the agent's own state and control, and pairwise coupling components that are symmetric between the two agents involved. The terminal costs decompose analogously.

The game defines a global stage potential and a global terminal potential, each built from the agents' tracking terms and the symmetric coupling terms. Summing the stage potential over the horizon and adding the terminal potential yields a horizon-wise potential.

Any unilateral control update alters $J_i$ and the horizon-wise potential identically. Hence, Nash equilibria (NE) correspond to stationary points of the potential, and under standard assumptions (compactness, continuity), an NE, and thus an $\varepsilon$-NE, always exists.
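The potential property can be checked numerically on a toy static game. In this sketch each agent's decision is a single scalar, the tracking and coupling terms are hypothetical, and the potential is formed by summing all tracking terms and counting each symmetric pairwise coupling once; none of these specific forms are taken from the paper.

```python
import random

GOALS = [0.0, 1.0, 2.0]  # hypothetical targets, one per agent


def tracking(i, u):
    """Individual term that depends only on agent i's own decision."""
    return (u[i] - GOALS[i]) ** 2


def coupling(i, j, u):
    """Symmetric pairwise term: coupling(i, j, u) == coupling(j, i, u)."""
    return 1.0 / (1.0 + (u[i] - u[j]) ** 2)


def agent_cost(i, u):
    return tracking(i, u) + sum(coupling(i, j, u) for j in range(len(u)) if j != i)


def potential(u):
    n = len(u)
    pairs = sum(coupling(i, j, u) for i in range(n) for j in range(i + 1, n))
    return sum(tracking(i, u) for i in range(n)) + pairs


def unilateral_gap(i, u, new_ui):
    """Absolute difference between the change in J_i and the change in the potential."""
    v = list(u)
    v[i] = new_ui
    return abs((agent_cost(i, v) - agent_cost(i, u)) - (potential(v) - potential(u)))


rng = random.Random(0)
max_gap = max(
    unilateral_gap(i, [rng.uniform(-2, 2) for _ in GOALS], rng.uniform(-2, 2))
    for i in range(len(GOALS)) for _ in range(100)
)
print(max_gap < 1e-9)
```

Because every coupling term involving agent $i$ appears exactly once in both the agent's cost and the potential, the two changes agree up to floating-point rounding.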

3. Neighborhood-Dominated Iterative Best Response (ND-iBR)

The ND-iBR algorithm is an iterated $\varepsilon$-Best Response (i$\varepsilon$-BR) method to find $\varepsilon$-Nash equilibria in a distributed, scalable manner.

At every iteration, each agent $i$ checks whether a local improvement of at least $\varepsilon$ is possible:

$$J_i(x^{(0)}, (u_i, u_{-i})) - \min_{u_i'} J_i(x^{(0)}, (u_i', u_{-i})) \geq \varepsilon$$

The best-response subproblem is solved locally,

$$u_i^{\star} \in \arg\min_{u_i'} J_i(x^{(0)}, (u_i', u_{-i}))$$

and $u_i$ is updated only for the improving agent. This process is guaranteed to terminate in finitely many steps at an $\varepsilon$-NE: every accepted update decreases the potential by at least $\varepsilon$, and the potential is continuous and bounded below.

Agents restrict their coupling to local neighborhoods, so that each agent's local cost considers only the agents in its own neighbor set. The decentralized ND-iBR procedure preserves convergence guarantees in this localized setting.
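The termination argument can be illustrated with a plain $\varepsilon$-best-response loop on a toy game with neighbor-restricted coupling. This is a sketch under stated assumptions, not the ND-iBR algorithm from the paper: the scalar decisions, the action grid, the chain-shaped neighbor sets, and the cost terms are all hypothetical.

```python
EPS = 1e-3
GOALS = [0.0, 0.2, 0.4, 0.6]                         # hypothetical targets
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # symmetric neighbor sets (assumed)
ACTIONS = [k / 20 for k in range(-20, 41)]           # candidate decisions on [-1, 2]


def local_cost(i, u):
    """Tracking term plus a repulsive coupling with neighbors only."""
    track = (u[i] - GOALS[i]) ** 2
    repel = sum(0.05 / (0.01 + (u[i] - u[j]) ** 2) for j in NEIGHBORS[i])
    return track + repel


def potential(u):
    """All tracking terms plus each neighbor pair's coupling counted once."""
    track = sum((u[i] - GOALS[i]) ** 2 for i in range(len(u)))
    repel = sum(0.05 / (0.01 + (u[i] - u[j]) ** 2)
                for i in NEIGHBORS for j in NEIGHBORS[i] if i < j)
    return track + repel


def best_response(i, u):
    """Agent i's cost-minimizing action with the other decisions held fixed."""
    return min(ACTIONS, key=lambda a: local_cost(i, u[:i] + [a] + u[i + 1:]))


def eps_best_response(u, eps=EPS):
    """Sweep over agents; accept an update only if it improves the local cost by >= eps."""
    u, updates, improved = list(u), 0, True
    while improved:
        improved = False
        for i in range(len(u)):
            a = best_response(i, u)
            gain = local_cost(i, u) - local_cost(i, u[:i] + [a] + u[i + 1:])
            if gain >= eps:
                u[i], updates, improved = a, updates + 1, True
    return u, updates


u0 = list(GOALS)
u_star, updates = eps_best_response(u0)
print(updates, updates * EPS <= potential(u0) - potential(u_star) + 1e-9)
```

Since each accepted update lowers the potential by at least $\varepsilon$ and the potential here is nonnegative, the number of updates cannot exceed the initial potential drop divided by $\varepsilon$, which is the bound checked in the final line.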

4. Multi-Agent Forward Reachable Set Analysis

To address safety under model and disturbance uncertainty, RE-DPG uses Multi-Agent Forward Reachable Sets (MA-FRS). For each agent, the linearized error dynamics under LQR feedback and bounded disturbances yield an ellipsoidal reachable set at each time step, with the shape matrix of the ellipsoid propagated using a Lyapunov and Minkowski-sum recursion.

Collision avoidance between agents $i$ and $j$ is encoded by requiring that their position ellipsoids do not intersect, a condition expressed through the position submatrix of each agent's shape matrix. Exponential penalty terms in the cost function soft-constrain this condition, preserving differentiability for gradient-based optimization and ensuring collision avoidance.
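A minimal sketch of ellipsoidal reachable-set propagation and a separation check follows. It assumes a 2D position-error state, a hypothetical closed-loop matrix `A_CL`, and a disturbance bound `w_bar > 0`. The Minkowski sum is over-approximated with the standard bound $(1 + 1/p)\,Q_1 + (1 + p)\,Q_2$ using the trace-minimizing weight $p$, and the separation test is a conservative bounding-ball check rather than the condition used in the paper.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def propagate(q, a_cl, w_bar):
    """One step of an outer ellipsoidal bound for e+ = A_cl e + w with ||w|| <= w_bar."""
    aq = matmul(matmul(a_cl, q), transpose(a_cl))        # Lyapunov-style term A Q A^T
    w = [[w_bar ** 2, 0.0], [0.0, w_bar ** 2]]            # disturbance ball as an ellipsoid
    tr_aq, tr_w = aq[0][0] + aq[1][1], w[0][0] + w[1][1]
    if tr_aq == 0.0:
        return w
    p = math.sqrt(tr_aq / tr_w)                           # trace-minimizing weight
    return [[(1 + 1 / p) * aq[i][j] + (1 + p) * w[i][j] for j in range(2)]
            for i in range(2)]


def radius(q):
    """Longest semi-axis: square root of the largest eigenvalue of a symmetric 2x2 matrix."""
    mean, diff = (q[0][0] + q[1][1]) / 2, (q[0][0] - q[1][1]) / 2
    return math.sqrt(mean + math.hypot(diff, q[0][1]))


def separated(c_i, q_i, c_j, q_j, d_col):
    """Conservative test: True implies the two ellipsoids are at least d_col apart."""
    return math.dist(c_i, c_j) >= radius(q_i) + radius(q_j) + d_col


A_CL = [[0.8, 0.1], [0.0, 0.8]]          # hypothetical stable closed-loop matrix
q = [[0.01, 0.0], [0.0, 0.01]]           # assumed initial uncertainty
for _ in range(20):
    q = propagate(q, A_CL, w_bar=0.05)

print(separated((0.0, 0.0), q, (3.0, 0.0), q, d_col=0.5))
```

The bounding-ball test can reject pairs of elongated ellipsoids that are in fact separated, so it trades tightness for simplicity; a tighter check would work with the ellipsoid shapes directly.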

5. Theoretical Properties and Guarantees

The ND-iBR process provides finite-step convergence to an $\varepsilon$-NE, as the potential function is decreased by at least $\varepsilon$ per iteration and is bounded below. The use of MA-FRS-based constraints guarantees that, at any candidate $\varepsilon$-NE, all neighbor agent pairs will have positively separated reachable sets throughout the planning horizon, ensuring minimum separation under all admissible disturbances.

The framework accommodates decentralized computation while maintaining global safety, leveraging the structure of dynamic potential games for tractable optimization and stability.

6. Experimental Results and Empirical Evaluation

Simulation experiments with quadrotors in 3D environments, planned with receding-horizon MPC and subjected to zero-mean truncated Gaussian disturbances, were run across 50 Monte Carlo trials. They demonstrated that RE-DPG achieved zero collisions, stable tracking cost, and competitive time efficiency. The computational requirement for MA-FRS propagation was approximately 102 ms per agent. Baseline comparisons included non-FRS penalties, PiLQR, and Swarm-DRL.

Real-world 2D hardware validation on Raspberry Pi 5-equipped 4WD vehicles using ROS-based distributed planning further confirmed the framework's practical effectiveness. Scenarios included multi-vehicle intersection negotiation and non-neighbor cases, all of which resulted in collision-free execution (zero near-collisions) and dynamic compliance with yielding behaviors.

7. Synthesis and Significance

RE-DPG systematically couples dynamic potential games, the ND-iBR protocol, and FRS-based reachability to achieve robust multi-agent coordination. The framework delivers decentralized, scalable, and provably safe motion planning under bounded disturbances, with theoretical and empirical evidence for finite convergence and minimum-separation guarantees. A plausible implication is that RE-DPG can be extended beyond multi-robot domains to general settings where collaborative agents require proactive safety under bounded uncertainty, provided that suitable local dynamics linearizations and coupling structures exist (Mai et al., 15 Nov 2025).
