Hamilton-Jacobi-Bellman QVI

Updated 30 December 2025

Hamilton-Jacobi-Bellman Quasi-Variational Inequality is a nonlinear PDE with nonlocal impulse operators that unifies continuous control with discrete interventions.
It is derived from dynamic programming principles, modeling state evolution via both drift-diffusion and instantaneous impulse actions in optimal control scenarios.
Numerical schemes for HJBQVIs use implicit finite-difference methods and policy iteration to achieve stability and convergence, including in multi-dimensional problems.

A Hamilton-Jacobi-Bellman Quasi-Variational Inequality (HJBQVI) is a nonlinear, typically degenerate partial differential equation (PDE) with nonlocal (impulse) terms, arising from dynamic programming formulations of optimal control and game problems that permit both continuous and impulse (singular) control actions. The HJBQVI unifies aspects of stochastic/deterministic control, game theory, and free boundary problems, and serves as the canonical object in impulse control, combined stochastic-impulse control, and differential games with options for discrete interventions. Mathematically, HJBQVIs are characterized by a variational structure: the solution is constrained by a nonlocal obstacle operator, resulting in variational inequalities with solution-dependent obstacles that encode optimal intervention policies.

1. Model Origins and Problem Classes

HJBQVIs are derived as dynamic programming equations for control or game problems in which the controller (or each of several agents) is allowed to change the system's state by both continuous actions (e.g., drift and diffusion controls) and discrete, state-jumping "impulses" (e.g., buying, selling, resetting, or blocking events). In the classical formulation, the state variable $x \in \mathbb{R}^n$ evolves between impulses according to a controlled ODE or SDE, possibly with running cost/reward $f(x)$, until the agent applies an impulse, instantaneously changing the state and incurring a corresponding cost.

A prototypical example is the infinite-horizon two-player zero-sum impulse control game analyzed in "A Zero-Sum Deterministic Impulse Controls Game in Infinite Horizon with a New HJBI QVI" (Asri et al., 2021). Here, each player controls an impulse sequence (jump times and actions), and the value function is the solution to a QVI with two nonlocal obstacles. Analogous finite-horizon impulse games are found in (Cosso, 2012).

The general stochastic or deterministic impulse control problem—possibly with both continuous and impulse controls—includes state evolution

$dy(t) = b(y(t), u(t)) dt + \sigma(y(t), u(t)) dW_t$

except at discrete times, where an impulse $\xi$ is applied: $y(\tau^+) = y(\tau^-) + \xi$, with cost $l(\tau, \xi)$. The value function $V$ satisfies a quasi-variational inequality (QVI) with a nonlocal operator representing the optimization over impulse choices.
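The dynamics above can be illustrated with a minimal simulation sketch. It assumes a scalar state, a fixed feedback rule folded into the drift and diffusion functions (so the control $u$ does not appear explicitly), and a hypothetical impulse schedule given as a map from step index to jump size; none of these choices come from the cited works.

```python
import math
import random


def simulate_path(y0, b, sigma, impulses, T=1.0, n_steps=1000, seed=0):
    """Euler-Maruyama path of dy = b(y) dt + sigma(y) dW with impulses.

    `impulses` maps a step index k to a jump size xi (hypothetical schedule).
    At that step the state moves from y(tau-) to y(tau+) = y(tau-) + xi
    before the next diffusion increment is applied.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    y = y0
    path = [y]
    for k in range(n_steps):
        if k in impulses:
            y += impulses[k]                      # instantaneous jump
        dW = rng.gauss(0.0, math.sqrt(dt))        # Brownian increment
        y += b(y) * dt + sigma(y) * dW            # drift-diffusion step
        path.append(y)
    return path


# Illustrative use: mean-reverting drift, constant volatility, two impulses.
example = simulate_path(
    y0=1.0,
    b=lambda y: -0.5 * y,
    sigma=lambda y: 0.2,
    impulses={250: 0.5, 700: -0.3},
)
```

Setting `sigma` to zero recovers the deterministic (ODE) case mentioned above.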

2. Mathematical Formulation of the HJBQVI

The canonical HJBQVI, as established in (Azimzadeh et al., 2017, Zhou et al., 2020, Ieda, 2013, Meteykin, 24 Dec 2025), with state $x \in \mathbb{R}^n$ and time $t \in [0, T]$, is

$\min \Big\{ V_t(t,x) + H\bigl(t, x, \nabla V(t, x)\bigr),\ N[V](t, x) - V(t, x) \Big\} = 0,$

with terminal condition $V(T, x) = h(x)$. In this cost-minimization form the constraint $V \leq N[V]$ holds everywhere, and the nonlocal obstacle operator $N$ takes forms such as

$N[V](t, x) = \inf_{\xi \in \Xi} \big\{ V(t, x + \xi) + K(\xi) \big\},$

where $K(\xi)$ is the cost of the impulse $\xi$. For reward maximization, $N$ is instead a supremum of $V(t, x + \xi) - K(\xi)$, the outer $\min$ becomes a $\max$, and the constraint reverses to $V \geq N[V]$; two-player games involve both kinds of operator. The Hamiltonian $H(t, x, p)$ includes the infinitesimal generator for the continuous dynamics and may take an infimum or supremum over conventional control variables.
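On a grid, the nonlocal operator is usually evaluated by scanning a finite set of candidate impulses and interpolating the value function at the post-impulse state. The following is a minimal one-dimensional sketch, assuming a uniform grid, a finite impulse set, and constant extrapolation outside the grid; the function names and the `maximize` flag are illustrative choices, and both the infimum-plus-cost and supremum-minus-cost conventions are covered.

```python
def interp(xs, v, x):
    """Linear interpolation of grid values v at x, clamped to the grid ends."""
    if x <= xs[0]:
        return v[0]
    if x >= xs[-1]:
        return v[-1]
    h = xs[1] - xs[0]                             # uniform spacing assumed
    i = min(int((x - xs[0]) / h), len(xs) - 2)    # index of the left neighbour
    w = (x - xs[i]) / h                           # weight of the right neighbour
    return (1.0 - w) * v[i] + w * v[i + 1]


def intervention(xs, v, impulses, cost, maximize=False):
    """Discrete nonlocal operator N[v] at every grid node.

    minimize: N[v](x) = min over xi of v(x + xi) + cost(xi)
    maximize: N[v](x) = max over xi of v(x + xi) - cost(xi)
    """
    out = []
    for x in xs:
        if maximize:
            out.append(max(interp(xs, v, x + xi) - cost(xi) for xi in impulses))
        else:
            out.append(min(interp(xs, v, x + xi) + cost(xi) for xi in impulses))
    return out


# Illustrative data: quadratic values, two downward impulses,
# fixed-plus-proportional cost (all hypothetical).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
v = [x * x for x in xs]
impulse_cost = lambda xi: 0.5 + 0.1 * abs(xi)
N_min = intervention(xs, v, [-1.0, -2.0], impulse_cost)
N_max = intervention(xs, v, [-1.0, -2.0], impulse_cost, maximize=True)
```

Because $N[v]$ depends on $v$ itself, the obstacle moves with the solution, which is what distinguishes a quasi-variational inequality from an ordinary obstacle problem.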

In differential games with impulse controls, the QVI structure generalizes to a double-obstacle form (for example, deterministic games (Asri et al., 2021), stochastic games (Cosso, 2012)). In the stationary deterministic case with discount rate $\lambda$ it reads

$\max\Big\{ \min\big[ \lambda v(x) - Dv(x) \cdot b(x) - f(x),\ v(x) - H_{\sup} v(x) \big],\ v(x) - H_{\inf} v(x) \Big\} = 0,$

where $H_{\sup}$ and $H_{\inf}$ are nonlocal maximization and minimization operators, respectively, representing optimal interventions by each player.

3. Viscosity Solution Theory and Uniqueness

Viscosity solution theory provides the correct analytic framework for HJBQVIs. The nonlocal obstacle term creates essential differences from classical PDEs and requires careful treatment. Under the standard definitions of viscosity sub- and supersolutions for QVIs, the comparison principle (and hence uniqueness) can fail. It is recovered when the supersolution notion is strengthened to enforce the nonlocal constraint globally (for the $\min$ form above, $V \leq N[V]$ everywhere) and to test the PDE only where the obstacle is inactive (Zhou et al., 2020).

For the deterministic HJBQVI:

Subsolution: $V$ is a viscosity subsolution if, whenever a test function $\varphi$ touches $V$ from above at $(t_0, x_0)$, $\min\{\varphi_t + H, N[V] - V\} \geq 0$ holds at that point, with $H$ evaluated at $\nabla\varphi$.
Supersolution (modified): $V$ is a viscosity supersolution if $V \leq N[V]$ everywhere, $V(T, x) \geq h(x)$, and at any point with $V < N[V]$ where a test function $\varphi$ touches $V$ from below, $\varphi_t + H \leq 0$.

Under standard ellipticity, Lipschitz, and convexity conditions on the data, a comparison principle holds (every subsolution lies below every supersolution), ensuring uniqueness and stability of viscosity solutions (Zhou et al., 2020, Azimzadeh et al., 2017).

4. Dynamic Programming, Dual Obstacles, and Impulse Game Structure

The derivation of the HJBQVI relies on the dynamic programming principle (DPP) for problems with impulse interventions. The DPP expresses the value function recursively in terms of both local evolution and the possibility of immediate impulses. This yields variational inequalities whose obstacles are implicit in the value function itself ("solution-dependent obstacles").
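A formal sketch of this derivation, written for a cost-minimization problem, is given below. The symbols $f$ (running cost), $K$ (impulse cost), and $\theta$ (an arbitrary stopping time in $[t, T]$) are assumptions of this sketch, and the limiting argument is heuristic: it presumes enough smoothness to apply a Taylor or Itô expansion.

```latex
% Dynamic programming principle (cost-minimization convention):
V(t,x) = \inf_{u,\,(\tau_i,\xi_i)} \mathbb{E}\Big[ \int_t^{\theta} f(y(s))\,ds
         + \sum_{\tau_i < \theta} K(\xi_i) + V\big(\theta, y(\theta)\big) \Big]

% (a) Applying any impulse \xi immediately is admissible, hence
V(t,x) \le \inf_{\xi \in \Xi} \big\{ V(t, x+\xi) + K(\xi) \big\} = N[V](t,x)

% (b) Waiting on [t, t+h] with a constant control u and letting h \to 0 gives
0 \le V_t + b \cdot \nabla V + \tfrac{1}{2}\,\mathrm{tr}\big(\sigma\sigma^{\top} D^2 V\big) + f
    \quad \text{for every } u

% At each (t,x) at least one of (a), (b) holds with equality, so
\min\big\{ V_t + H,\; N[V] - V \big\} = 0,
\qquad
H = \inf_u \Big\{ b \cdot \nabla V + \tfrac{1}{2}\,\mathrm{tr}\big(\sigma\sigma^{\top} D^2 V\big) + f \Big\}
```

In the deterministic case the second-order term vanishes and $H$ depends on the gradient only.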

In two-player games, the double-obstacle QVI structure arises naturally (Cosso, 2012, Asri et al., 2021):

The inner obstacle encodes one player's best impulse choice (sup over impulses minus cost).
The outer obstacle enforces the second player's intervention (inf over impulses plus cost).

This produces a "max–min" or "double-obstacle" structure, with the Isaacs condition reducing to a sequence of local min–max (or max–min) and nonlocal optimizations.

If impulses are disallowed (costs set to infinity), HJBQVI reduces to the standard HJB or HJBI PDE. Allowing impulses fundamentally alters the analytic and qualitative structure, introducing free boundaries and non-smooth regions.

5. Numerical Discretization and Convergence Theory

Implicit finite-difference schemes are standard for HJBQVIs, favored for unconditional stability (no CFL-type timestep restriction) and monotonicity properties (Ieda, 2013, Azimzadeh et al., 2017, Meteykin, 24 Dec 2025). The typical discretization advances the value function fully implicitly in time, and the impulse (nonlocal) term is handled by an explicit maximization/minimization at each grid node.

The discretization structure involves:

Backward Euler time-stepping: $\partial_t v \approx (v^{n+1} - v^n) / \delta t$.
Discrete spatial stencils for local generators (upwind for convection, centered for diffusion).
Nonlocal impulse operator realized by maximizing/minimizing over discretized jump sets and linear interpolation for off-grid states.
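The local part of such a scheme can be sketched in one space dimension as follows. This is an illustrative implementation under several assumptions that are not taken from the cited works: no continuous control, a uniform grid, boundary nodes frozen at their previous values, and a hand-written tridiagonal solver. It performs one backward-Euler step of $v_t + b\,v_x + \tfrac{1}{2}\sigma^2 v_{xx} + f = 0$ with upwind differences for the drift and centered differences for the diffusion; the impulse term is left out.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / m
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_step(v_next, xs, dt, drift, sigma, f):
    """Solve (I - dt*L) v = v_next + dt*f for the values one step earlier in time."""
    n = len(xs)
    h = xs[1] - xs[0]
    lower = [0.0] * n
    diag = [1.0] * n
    upper = [0.0] * n
    rhs = list(v_next)          # boundary rows stay v[i] = v_next[i] (assumption)
    for i in range(1, n - 1):
        b = drift(xs[i])
        a = 0.5 * sigma(xs[i]) ** 2 / h ** 2      # centered diffusion weight
        up = a + max(b, 0.0) / h                  # weight on v[i+1] (used when b > 0)
        lo = a + max(-b, 0.0) / h                 # weight on v[i-1] (used when b < 0)
        lower[i] = -dt * lo
        upper[i] = -dt * up
        diag[i] = 1.0 + dt * (lo + up)
        rhs[i] = v_next[i] + dt * f(xs[i])
    return thomas(lower, diag, upper, rhs)


# Illustrative call with a deliberately large timestep.
grid = [0.25 * i for i in range(9)]
v_prev = implicit_step([x * x for x in grid], grid, dt=10.0,
                       drift=lambda x: 1.0 - x, sigma=lambda x: 0.5,
                       f=lambda x: 0.0)
```

Each interior row has a positive diagonal, non-positive off-diagonals, and row sum one, so with zero running cost the new values stay between the minimum and maximum of the old ones for any timestep size. This is the discrete maximum principle behind the stability and monotonicity properties mentioned above.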

Convergence of implicit schemes to the unique viscosity solution is guaranteed under monotonicity, stability, and nonlocal consistency, as formalized by Barles–Souganidis-type analyses extended to nonlocal equations (Azimzadeh et al., 2017, Meteykin, 24 Dec 2025). Reported convergence is first order in time and up to second order in space under standard regularity assumptions and grid layout (Ieda, 2013); where upwind differences are used for the convection term, the spatial order drops to one.
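Convergence orders of this kind are commonly estimated from three successive refinements without knowing the exact solution. The helper below is a generic sketch of that calculation; the function name and the synthetic numbers are illustrative and are not results from the cited works.

```python
import math


def observed_order(u_h, u_h2, u_h4):
    """Estimate the convergence order p from values computed with steps h, h/2, h/4.

    If the error behaves like C * h**p, successive differences shrink by 2**p.
    """
    return math.log2(abs(u_h - u_h2) / abs(u_h2 - u_h4))


# Synthetic data with a known error model (not output of an HJBQVI solver):
first = observed_order(*(1.0 + 0.3 * h for h in (0.4, 0.2, 0.1)))
second = observed_order(*(1.0 + 0.3 * h * h for h in (0.4, 0.2, 0.1)))
```

Here `first` is close to 1 and `second` is close to 2, matching the error models used to build the synthetic data.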

Policy iteration (Howard's algorithm) is the practical solution method: each iteration alternates between selecting the optimal control/impulse action at every grid node and solving the resulting sparse linear system. When the matrices involved are weakly chained diagonally dominant, every linear system is nonsingular and the iterates converge monotonically; with a finite set of discretized controls and impulses, the iteration terminates after finitely many steps per timestep (Meteykin, 24 Dec 2025, Ieda, 2013).
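The alternation between policy evaluation and policy improvement can be sketched on a small cost-minimization example. The code assumes a discrete QVI of the form $\max\{(Av - r)_i,\ v_i - \min_{j<i}(v_j + c_{ij})\} = 0$, where $A$ stands for an implicit-step matrix and $r$ for its right-hand side. Impulses are restricted to jumps from node $i$ to a lower-indexed node $j$, so that every policy matrix is weakly chained diagonally dominant. The dense solver, the names, and the test data are illustrative assumptions; practical codes use sparse solvers.

```python
def solve_dense(M, rhs):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(rhs)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(A[k][c]))
        A[c], A[p] = A[p], A[c]
        for k in range(c + 1, n):
            m = A[k][c] / A[c][c]
            for q in range(c, n + 1):
                A[k][q] -= m * A[c][q]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(A[k][q] * x[q] for q in range(k + 1, n))
        x[k] = (A[k][n] - s) / A[k][k]
    return x


def policy_iteration(A, r, cost, max_iter=50, tol=1e-10):
    """policy[i] == i means "continue"; policy[i] == j < i means "jump to node j"."""
    n = len(r)
    policy = list(range(n))                       # start with "continue" everywhere
    for it in range(1, max_iter + 1):
        # Policy evaluation: one linear system for the current policy.
        M, b = [], []
        for i, j in enumerate(policy):
            if j == i:
                M.append(A[i][:])
                b.append(r[i])
            else:
                row = [0.0] * n
                row[i], row[j] = 1.0, -1.0        # encodes v_i - v_j = cost(i, j)
                M.append(row)
                b.append(cost(i, j))
        v = solve_dense(M, b)

        def residual(i, j):
            if j == i:
                return sum(A[i][k] * v[k] for k in range(n)) - r[i]
            return v[i] - v[j] - cost(i, j)

        # Policy improvement: switch only if another option has a larger residual.
        new_policy = []
        for i in range(n):
            arg, best = policy[i], residual(i, policy[i])
            for j in range(i + 1):
                res = residual(i, j)
                if res > best + tol:
                    arg, best = j, res
            new_policy.append(arg)
        if new_policy == policy:
            return v, policy, it
        policy = new_policy
    raise RuntimeError("policy iteration did not converge")


# Hypothetical test problem: 6 nodes, pure diffusion, quadratic costs.
n, dt, a = 6, 0.5, 1.0
xs = [float(i) for i in range(n)]
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 1.0
    if 0 < i < n - 1:
        A[i][i] += 2 * dt * a
        A[i][i - 1] = A[i][i + 1] = -dt * a
r = [x * x + dt * x * x for x in xs]              # v_next + dt * f with h = f = x^2
cost = lambda i, j: 1.0 + 0.5 * (xs[i] - xs[j])   # fixed plus proportional cost
v, policy, iters = policy_iteration(A, r, cost)
```

In this example the iterates decrease monotonically, and the returned policy continues at nodes 0 and 1 while jumping to node 0 from every higher node. Preferring the current action on ties avoids switching back and forth between equally good options.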

6. Applications and Illustrative Examples

HJBQVIs have been deployed in a wide spectrum of control and game settings, such as:

Impulse control in finance and economics: optimal dividend and reinsurance, irreversible investment, inventory management (Azimzadeh et al., 2017, Ieda, 2013).
Optimal forest harvesting: combined jump/non-jump and exit options with stochastic state, where optimal intervention barriers are computed by the solution to a fully implicit HJBQVI scheme (Ieda, 2013).
Market-making with discrete trading actions: combined limit order and market order controls in limit order books, where the optimal strategy is captured as the unique viscosity solution to a multi-dimensional HJBQVI (Meteykin, 24 Dec 2025).
Deterministic and stochastic zero-sum impulse games: two-player games with impulse moves by both parties, analyzed in both infinite-horizon (Asri et al., 2021) and finite-horizon (Cosso, 2012) settings.

Numerical experiments reported in these works indicate that implicit schemes remain stable for large timesteps, that policy iteration converges in few iterations, and that the discrete solutions reproduce the theoretically optimal strategies, provided the grid is sufficiently fine.

7. Structural Innovations and Theoretical Advances

Several mathematical innovations underpin modern HJBQVI research:

The replacement of classical obstacle terms with first-order "differential obstacles" in certain new QVI formulations, facilitating tractable existence and uniqueness results under weaker assumptions (e.g., proportional costs) (Asri et al., 2021).
Identification and correction of issues in the classical viscosity supersolution definition, restoring the comparison principle required for uniqueness (Zhou et al., 2020).
Rigorous Barles–Souganidis convergence analysis for monotone implicit schemes in the presence of nonlocal obstacles (Azimzadeh et al., 2017).
Double-obstacle formulations for multi-agent and game-theoretic impulse control contexts (Cosso, 2012, Asri et al., 2021).

These theoretical developments provide a robust analytic and computational foundation for impulse control in high-dimensional and game-theoretic settings, broadening the class of stochastic and deterministic decision problems tractable via HJBQVI methodologies.