
HJ-Based Tracking Control

Updated 4 January 2026
  • HJ-based tracking control is a framework that uses Hamilton–Jacobi PDEs to formulate tracking as a dynamic game, ensuring provable error bounds and robustness.
  • The approach integrates advanced methodologies such as SOS relaxation, quadratic value approximations, and neural network critic learning to handle high-dimensional or uncertain systems.
  • Practical implementations yield explicit tracking error guarantees and safety margins, enabling real-time planning and consensus in multi-agent and nonlinear applications.

A Hamilton–Jacobi (HJ)-based tracking control algorithm refers to a class of control synthesis and analysis methods that leverage Hamilton–Jacobi partial differential equations (PDEs) to achieve provable tracking error guarantees, optimality, or robustness for high-dimensional or uncertain dynamical systems. These techniques are foundational across several domains, including safe real-time planning under model mismatch, data-driven optimal tracking in multi-agent systems, robust nonlinear tracking under input constraints, and model-free $H_\infty$ optimal tracking via reinforcement learning.

1. Fundamental Principles and Problem Setup

HJ-based tracking control unifies the concept of tracking a reference trajectory under uncertainty, disturbances, or model mismatch by minimizing a value function dictated by the system’s dynamic evolution. Common setups involve:

  • Model mismatch planning and tracking: Fast trajectory planning via a low-dimensional “planner” model $\dot{x}_p = f_p(x_p, u_p)$ paired with physical execution via a high-dimensional “tracker” model $\dot{x}_t = f_t(x_t, u_t)$, with $x_p$ embedded in $x_t$ and the relative state $r = x_t - Q x_p$ (Singh et al., 2018); see the code sketch at the end of this section.
  • Consensus tracking for multi-agent systems: Each agent observes its own and its neighbors’ states, computes a tracking error $e_i(k)$ relative to both neighbors and the leader under heterogeneous linear dynamics, and seeks Nash-optimal consensus tracking (Zhang et al., 2017).
  • Optimal tracking with robustness and actuator constraints: Nonlinear or uncertain affine-in-control plants augmented with error and reference dynamics, optimizing a discounted cost subject to input and mismatch bounds (Mishra et al., 2019).
  • Nonlinear $H_\infty$ tracking via differential game: Affine nonlinear systems with unknown dynamics and disturbances, minimizing the $L_2$-gain from disturbance to tracking error/control output via policy iteration (Wang, 2024).

In all cases, the tracking error evolution and optimal control synthesis derive from the HJ equations (Hamilton–Jacobi–Bellman or Hamilton–Jacobi–Isaacs) associated with a dynamic programming or differential game viewpoint.
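
To make the planner–tracker pairing concrete, the following minimal sketch sets up a hypothetical 3D Dubins-car planner, a 5D kinematic car tracker, and the relative state $r = x_t - Q x_p$. The specific dynamics, inputs, and embedding matrix $Q$ are illustrative assumptions, not the construction from the cited work.

```python
import numpy as np

# Hypothetical planner-tracker pairing: a 3D Dubins planner (x, y, theta)
# and a 5D kinematic car tracker (x, y, theta, v, omega). Q embeds planner
# states into tracker coordinates so the relative state is r = x_t - Q @ x_p.

def planner_dynamics(x_p, u_p, v_plan=1.0):
    """3D Dubins car: constant speed, turn-rate input u_p."""
    x, y, th = x_p
    return np.array([v_plan * np.cos(th), v_plan * np.sin(th), u_p])

def tracker_dynamics(x_t, u_t):
    """5D kinematic car: inputs are acceleration and angular acceleration."""
    x, y, th, v, om = x_t
    a, alpha = u_t
    return np.array([v * np.cos(th), v * np.sin(th), om, a, alpha])

# Embedding matrix: planner (x, y, theta) maps onto the first three tracker
# coordinates; speed and turn rate have no planner counterpart.
Q = np.zeros((5, 3))
Q[0, 0] = Q[1, 1] = Q[2, 2] = 1.0

x_p = np.array([0.0, 0.0, 0.1])
x_t = np.array([0.2, -0.1, 0.15, 1.0, 0.0])
r = x_t - Q @ x_p  # relative (tracking-error) state
print("relative state r =", r)
```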

2. Hamilton–Jacobi Reachability and Tracking Error Bounds

The canonical HJ-based framework formulates tracking control as a reachability problem or dynamic game, generating error bounds via PDE solutions:

  • HJ Differential Game: For planner–tracker systems, the reachability value function $V(\tau, r)$ solves

$$\frac{\partial V}{\partial \tau}(\tau, r) + H(r, \nabla_r V(\tau, r)) = 0$$

with Hamiltonian

$$H(r, p) = \max_{u_p \in \mathcal{U}_p} \min_{u_t \in \mathcal{U}_t} p^T \big( f_t(r, u_t) - Q f_p(Q^+ r, u_p) \big)$$

characterizing the worst-case model mismatch. The zero sublevel set of the infinite-horizon $V$ is the smallest forward-invariant Tracking Error Bound (TEB) (Singh et al., 2018). A brute-force numerical sketch of this Hamiltonian follows the list.

  • Multi-agent Nash-equilibrium tracking: The coupled HJ equations describe optimal consensus via the value function $V_i$ for each agent, given as

$$V_i(e_i) = \min_{u_i} \Big\{ e_i^T Q_i e_i + u_i^T R_i u_i + V_i\Big(\bar{A}_i e_i + F_i u_i + \sum_{j \in N_i} E_{ij} u_j\Big) \Big\}$$

yielding local control policies synthesized from the Riccati-like equations (Zhang et al., 2017).

  • Nonlinear optimal and robust tracking: The HJB residual $\mathcal{H}(z, u, V_z)$ encapsulates costs, system dynamics, bounds on uncertainty, and input constraints, enabling robust tracking (Mishra et al., 2019).
  • $H_\infty$ Tracking: The Isaacs equation for the min–max dynamic game is

$$X^T Q_r X - a V^{*}(X) + \nabla V^{*}(X)^T F(X) - \tfrac{1}{4} \nabla V^{*T} G R^{-1} G^T \nabla V^{*} + \tfrac{1}{4\gamma^2} \nabla V^{*T} K K^T \nabla V^{*} = 0$$

whose solution $V^*$ yields the state-feedback policy that ensures $L_2$ attenuation at level $\gamma$ (Wang, 2024).
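
Once dynamics and compact control sets are fixed, the max–min Hamiltonian above can be evaluated pointwise by brute force. The sketch below reuses the illustrative planner/tracker pair from the Section 1 sketch and samples both control sets on grids; real HJ reachability tools instead discretize the state space and integrate the PDE with level-set methods, so this is only a sanity-check approximation under assumed control bounds.

```python
import numpy as np

# Brute-force evaluation of the max-min Hamiltonian
#   H(r, p) = max_{u_p} min_{u_t} p^T (f_t(r, u_t) - Q f_p(Q^+ r, u_p))
# on sampled control grids. Dynamics and control bounds are illustrative.

def planner_dynamics(x_p, u_p, v_plan=1.0):
    x, y, th = x_p
    return np.array([v_plan * np.cos(th), v_plan * np.sin(th), u_p])

def tracker_dynamics(x_t, u_t):
    x, y, th, v, om = x_t
    a, alpha = u_t
    return np.array([v * np.cos(th), v * np.sin(th), om, a, alpha])

Q = np.zeros((5, 3)); Q[0, 0] = Q[1, 1] = Q[2, 2] = 1.0
Q_pinv = np.linalg.pinv(Q)  # Q^+ recovers the planner block of r

def hamiltonian(r, p, n=21):
    """Worst-case planner action vs. best tracker response, by sampling."""
    u_p_grid = np.linspace(-1.0, 1.0, n)                   # planner turn rate
    u_t_grid = [(a, al) for a in np.linspace(-2, 2, n)
                        for al in np.linspace(-2, 2, n)]   # (accel, ang. accel)
    best = -np.inf
    for u_p in u_p_grid:                                   # planner maximizes
        drift_p = Q @ planner_dynamics(Q_pinv @ r, u_p)
        val = min(p @ (tracker_dynamics(r, u_t) - drift_p) # tracker minimizes
                  for u_t in u_t_grid)
        best = max(best, val)
    return best

r = np.array([0.2, -0.1, 0.05, 1.0, 0.0])
p = np.array([1.0, 0.5, 0.1, 0.0, 0.0])  # costate sample, e.g. a gradient of V
print("H(r, p) =", hamiltonian(r, p))
```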

3. Computational Methodologies: Relaxations, Approximations, and Data-Driven Algorithms

Direct solution of HJ PDEs is generally intractable for high-dimensional systems. Significant algorithmic advances include:

  • Sum-of-Squares (SOS) Relaxation: For planner–tracker systems, a polynomial certificate $V(r)$ and controller $K(r, u_p)$ are constructed. The forward invariance conditions and input constraints are relaxed via SOS programming on semialgebraic sets, employing polynomial multipliers and alternating bilinear optimization over $(V, K, E)$ via convex subproblems (Singh et al., 2018).
  • Quadratic Value Approximation for Multi-agent Tracking: In multi-agent scenarios, each agent’s value function $V_i(e_i)$ is approximated quadratically, resulting in Riccati-like equations for $P_i$. These are iteratively updated using input/output Q-learning, leveraging sliding-window data, least-squares regression, and value iteration entirely from measured I/O signals, without explicit models (Zhang et al., 2017).
  • Critic Neural Network for Uncertain Nonlinear Systems: The value function $V^*(z)$ is parameterized by a neural network with weights $\widehat{W}$ and regressor $\vartheta(z)$, trained via gradient descent on the HJB residual error using a variable-gain law $\Gamma(e)$ adaptive to the error magnitude, together with Lyapunov-stabilizing and bounding terms (Mishra et al., 2019); a scalar sketch of this residual-descent idea follows the list.
  • Damped Newton and $\delta$-Policy Iteration: Model-free HJ-based $H_\infty$ tracking is achieved by recasting the HJI PDE as a generalized Bellman equation solvable by damped-Newton iteration. Both on-policy and off-policy $\delta$-policy iteration (with $\delta \in (0,1]$ controlling the step size and region of convergence) admit least-squares solutions for network weights using only recorded state–input–disturbance trajectories. Actor–critic neural networks implement the control policies and value functions without requiring system model knowledge (Wang, 2024).
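
As a concrete instance of residual-based critic training, the sketch below runs gradient descent on the continuous-time HJB residual for a scalar plant with a single quadratic feature and a variable gain that shrinks when the residual is large. The plant $\dot z = -z + u$, cost $z^2 + u^2$, feature choice, and gain schedule are illustrative assumptions, not the paper's construction; for this plant the exact value function is $V(z) = (\sqrt{2}-1)z^2$, so convergence can be checked.

```python
import numpy as np

# Sketch: critic weight trained by semi-gradient descent on the HJB residual
# with a variable (residual-attenuated) gain. All modeling choices here are
# illustrative; the exact answer for this toy plant is W* = sqrt(2) - 1.

rng = np.random.default_rng(0)
W = 0.0                                      # critic weight: V_hat(z) = W z^2
base_gain = 0.05

for step in range(20000):
    z = rng.uniform(-1.0, 1.0)               # persistently exciting samples
    gradV = 2.0 * W * z                      # dV_hat/dz
    u = -0.5 * gradV                         # greedy policy for cost z^2 + u^2
    delta = z**2 + u**2 + gradV * (-z + u)   # HJB residual at (z, u)
    # Semi-gradient of delta^2 wrt W (policy u held fixed), scaled by a
    # variable gain that backs off when the residual is large:
    grad = delta * 2.0 * z * (-z + u)
    W -= (base_gain / (1.0 + abs(delta))) * grad

print("learned W =", W, " exact W* =", np.sqrt(2) - 1)
```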

4. Tracking Error Guarantees, Safety Margins, and Integration into Planning

A defining property of HJ-based tracking control is the production of explicit error bounds or invariant sets:

  • Tracking Error Bound (TEB): Once $V(r)$ and the ellipsoid matrix $E$ are synthesized, the set $\{r : V(r) \leq 1\}$ is guaranteed forward invariant, projecting to the error domain as $\mathcal{E} = \{e : V(c^{-1}(e)) \leq 1\}$ or the ellipsoid $\{z : z^T E z \leq 1\}$, used as a safety margin in real-time planning (Singh et al., 2018).
  • Inflated Obstacles in Planner Space: Offline computation of the controller $K$, invariant $V$, and ellipsoid $E$ enables real-time inflation of planner obstacles, ensuring any low-dimensional plan that avoids these inflated regions will be trackable by the actual (high-dimensional) system (Singh et al., 2018); see the sketch following this list.
  • Consensus for Multi-agent Networks: Nash-optimal controller updates drive all agent and leader states to consensus, with tracking errors $e_i(k) \rightarrow 0$ as value iteration converges. Asymptotic convergence is established under persistent excitation and topology constraints (Zhang et al., 2017).
  • Uniform Ultimate Boundedness (UUB): Variable gain laws can achieve strictly tighter residual sets (lower tracking error oscillations and faster convergence) than fixed-rate gradient descent, with explicit Lyapunov analysis confirming the boundedness of both error and critic weights (Mishra et al., 2019).
  • $H_\infty$ Performance: Solution of the Isaacs equation and implementation of the optimal policy $u^*$ guarantee that the closed-loop $L_2$-gain from the disturbance $d$ to $[e, u]$ is less than the prescribed level $\gamma$ (Wang, 2024).
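
The following is a minimal sketch of TEB-based obstacle inflation, assuming the TEB ellipsoid $\{z : z^T E z \le 1\}$ lives in position-error coordinates and obstacles are circles in the planner's $(x, y)$ plane; the matrix $E$ and the obstacle list are made-up values. Inflating every obstacle by the ellipsoid's longest semi-axis is conservative but guarantees that any plan clearing the inflated set remains trackable.

```python
import numpy as np

# Sketch of planner-space obstacle inflation by a tracking error bound (TEB).
# E and the obstacle list are illustrative values, not from the cited work.

E = np.array([[4.0, 0.5],
              [0.5, 9.0]])             # TEB ellipsoid matrix (position block)

# The largest position error inside the TEB along any direction is
# 1 / sqrt(lambda_min(E)): the ellipsoid's longest semi-axis.
teb_radius = 1.0 / np.sqrt(np.linalg.eigvalsh(E)[0])

obstacles = [((2.0, 1.0), 0.5), ((5.0, 4.0), 1.2)]   # (center, radius)

# Conservative inflation: any planner path clearing the inflated circles is
# trackable, since the true system stays within teb_radius of the plan.
inflated = [(c, r + teb_radius) for (c, r) in obstacles]

for (c, r0), (_, r1) in zip(obstacles, inflated):
    print(f"obstacle at {c}: radius {r0:.2f} -> inflated {r1:.2f}")
```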

5. Numerical Experiments and Trade-Offs

HJ-based algorithms have been validated across domains, with extensive trade-off analysis:

| System / Scenario | Exact HJ or Baseline | SOS/ADP/NN-Based Method |
|---|---|---|
| 5D car tracking a 3D Dubins planner | $\sim$0.2 m radius, 25 h CPU | $\sim$0.48 m radius, 5 min CPU (Singh et al., 2018) |
| 8D airplane tracking a 4D Dubins planner | HJ intractable | $\sim$6 m $\times$ 4 m $\times$ 4.5 m in < 2 h (Singh et al., 2018) |
| 3-agent consensus (discrete-time) | Not considered | I/O Q-learning value iteration, 20–25 iterations (Zhang et al., 2017) |
| 2D/6D nonlinear, UAV tracking | Constant-gain critic: slow, larger oscillation | Variable gain: 5–10$\times$ faster, 3–5$\times$ smaller oscillation (Mishra et al., 2019) |
| Nonlinear $H_\infty$ (off-policy) | Requires a model | NN-based, model-free, $\leq$100 iterations (Wang, 2024) |

These results demonstrate that relaxations (SOS programming, adaptive dynamic programming, reinforcement learning, and value iteration) substantially improve scalability to high-dimensional or uncertain systems, at some cost in conservativeness (larger error sets or bounds), while retaining rigorous guarantees and efficient learning.

6. Convergence, Stability, and Theoretical Guarantees

Convergence analysis and stability results underpin the reliability of HJ-based tracking control:

  • Alternating Bilinear SOS Optimization: Guaranteed monotonic improvement via slack minimization and trust regions, with certificates of forward invariance when all SOS constraints are tight (Singh et al., 2018).
  • Bellman contraction and value iteration: Proven asymptotic convergence in multi-agent settings under strong connectivity, observability, reachability, and persistent excitation conditions (Zhang et al., 2017).
  • Adaptive critic learning: Variable-gain descent provably tightens ultimate boundedness and enhances convergence speed. Lyapunov-based analysis guarantees UUB of system trajectories and weights (Mishra et al., 2019).
  • Newton-type policy iteration: Kantorovich’s theorem ensures convergence of $\delta$-policy iteration for HJI equations when appropriate regularity and step-size conditions are met, with global stability for small $\delta$ and quadratic local convergence for $\delta = 1$ (Wang, 2024); a scalar skeleton of the damped iteration follows this list.
  • $H_\infty$-guaranteed bounds: Solution of the Isaacs PDEs yields policies with certified $L_2$-attenuation of disturbances, as validated both theoretically and in simulation (Wang, 2024).
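
To illustrate the damped Newton structure, the skeleton below runs $\delta$-policy iteration on a scalar linear–quadratic problem, where policy evaluation reduces to a scalar Lyapunov equation and the exact Riccati solution is available for comparison. The plant, weights, and $\delta$ are illustrative; the cited work applies the same damping to the HJI equation with neural-network critics and measured trajectories rather than a known model.

```python
import numpy as np

# Skeleton of damped (delta-) policy iteration for a scalar LQ problem.
# All numbers are illustrative; delta = 1 recovers the undamped Newton step.

a, b = 1.0, 1.0          # unstable scalar plant  dx/dt = a x + b u
q, r = 1.0, 1.0          # cost  integral of (q x^2 + r u^2) dt
delta = 0.5              # damping / step size in (0, 1]

K = 2.0                  # initial stabilizing gain (a - b K < 0)
P = 0.0
for k in range(50):
    a_cl = a - b * K
    assert a_cl < 0, "policy evaluation needs a stabilizing gain"
    P_eval = (q + r * K**2) / (-2.0 * a_cl)   # scalar Lyapunov equation
    P = P + delta * (P_eval - P)              # damped value update
    K = (b / r) * P                           # greedy (improved) policy

# Exact ARE solution of 2 a P - (b^2 / r) P^2 + q = 0, for comparison:
P_star = (a + np.sqrt(a**2 + b**2 * q / r)) * r / b**2
print(f"delta-PI: P = {P:.6f},  ARE: P* = {P_star:.6f}")
```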

7. Technological Impact and Extensions

HJ-based tracking control algorithms maintain safety and performance guarantees for systems that were previously intractable due to dimensionality or uncertainty. SOS programming, data-based RL, and policy iteration permit the extension of reachability-theoretic guarantees to multi-agent, nonlinear, and model-free settings with scalability, thereby enabling real-time planning, robust consensus, and resilient tracking across autonomous vehicles, robot teams, and uncertain nonlinear platforms.

A plausible implication is that further improvements in expressive polynomial/SOS representation, deep function approximation in RL/NN methods, and tailored data-driven PDE solvers will continue to expand the applicability of HJ-based tracking control into complex, safety-critical domains, while maintaining rigorous error bounds and optimality certificates.
