
Mean-Field Hamilton-Jacobi-Bellman Equation

Updated 27 November 2025
  • Mean-field HJB equations are partial differential equations characterizing the value function for stochastic control problems where dynamics depend on the state distribution.
  • They employ a dynamic programming principle on the Wasserstein space using Lions differentiability to tackle infinite-dimensional, non-local features.
  • Explicit linear-quadratic solutions demonstrate practical applications in finance and systemic risk, reducing the problem to coupled Riccati equations for optimal feedback controls.

A mean-field Hamilton-Jacobi-Bellman (HJB) equation is a partial differential equation characterizing the value function for stochastic optimal control problems in which the controlled system evolves according to McKean–Vlasov (mean-field) dynamics, i.e., where the drift, diffusion, or cost may depend on the distribution (law) of the state, and possibly also of the control. This construct extends classical HJB theory to infinite-dimensional state spaces, typically the Wasserstein space of probability measures equipped with Lions' notion of differentiability. Mean-field HJB equations form the core of dynamic programming approaches for mean-field control and optimization, underpinning contemporary research in mean-field stochastic control, mean-field games, and large-population Markov decision processes.

1. Formulation of the Mean-Field Stochastic Control Problem

In the McKean–Vlasov framework, a controlled process $(X_t)_{t\in[0,T]}$ follows the stochastic differential equation

$$dX_t = b(t, X_t, \mathbb{P}_{X_t}, \alpha_t)\,dt + \sigma(t, X_t, \mathbb{P}_{X_t}, \alpha_t)\,dW_t,$$

where $b$ and $\sigma$ are Lipschitz functions of $(x, \mu, \alpha)$, $\alpha$ is an admissible (progressively measurable, square-integrable) control with values in a compact set $\mathcal{A}\subset\mathbb{R}^m$, and $\mathbb{P}_{X_t}$ is the law of $X_t$. The cost functional is

$$J(\alpha) = \mathbb{E}\bigg[ \int_0^T f(t, X_t, \alpha_t, \mathbb{P}_{(X_t, \alpha_t)})\,dt + g(X_T, \mathbb{P}_{X_T}) \bigg],$$

with suitable growth and regularity constraints on $f$ and $g$. The goal is to minimize $J(\alpha)$ over admissible controls $\alpha$.

For feedback controls of the form $\alpha_t = \upsilon(t, X_t, \mathbb{P}_{X_t})$ with $\upsilon$ Lipschitz, the flow of marginals $\mu_t = \mathbb{P}_{X_t}$ evolves deterministically in $\mathcal{P}_2(\mathbb{R}^d)$, the space of probability measures with finite second moment. The value function thus becomes $v(t,\mu)$, the minimal cost starting from time $t$ and marginal law $\mu$ (Pham et al., 2015).
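The following minimal Python sketch illustrates this formulation by propagating $N$ interacting particles, with the empirical mean standing in for $\mathbb{P}_{X_t}$. The specific drift, diffusion, running cost, terminal cost, and feedback rule are illustrative placeholders, not the model of the cited paper.

```python
import numpy as np

def simulate_cost(upsilon, N=5000, T=1.0, n_steps=100, x0=1.0, seed=0):
    """Euler-Maruyama particle approximation of a scalar McKean-Vlasov SDE,
    with the empirical measure of N particles standing in for P_{X_t}."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.full(N, x0)
    running = 0.0
    for k in range(n_steps):
        t = k * dt
        m = X.mean()                                # empirical proxy for the mean of P_{X_t}
        a = upsilon(t, X, m)                        # feedback control v(t, x, mu)
        b = 0.5 * (m - X) + a                       # placeholder drift b(t, x, mu, a)
        running += dt * np.mean(a**2 + (X - m)**2)  # placeholder running cost f
        X = X + b * dt + 0.3 * np.sqrt(dt) * rng.standard_normal(N)  # sigma = 0.3
    return running + np.mean(X**2)                  # placeholder terminal cost g

# Example: feedback proportional to the deviation from the population mean.
J = simulate_cost(lambda t, x, m: 0.5 * (m - x))
print(f"Monte Carlo estimate of J(alpha): {J:.4f}")
```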

2. Dynamic Programming Principle and Bellman Equation on Wasserstein Space

Under this reformulation, the dynamic programming principle (DPP) holds in the space of probability measures:

$$v(t,\mu) = \inf_{\upsilon} \bigg\{ \int_t^{T} F\big(s, \mu_s, \upsilon(s, \cdot, \mu_s)\big)\,ds + G(\mu_T) \bigg\},$$

where $F$ and $G$ are the mean-field extensions of the running and terminal costs:

$$F(t, \mu, \upsilon) = \int_{\mathbb{R}^d} f\big(t, x, \upsilon(x), (\operatorname{Id}, \upsilon)_\#\mu\big)\,\mu(dx), \qquad G(\mu) = \int_{\mathbb{R}^d} g(x, \mu)\,\mu(dx).$$

The DPP takes a recursive form on $[0,T]\times \mathcal{P}_2(\mathbb{R}^d)$.
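Written out (a standard statement of the recursive form, included here for completeness): for every intermediate time $\theta \in [t, T]$,

$$v(t,\mu) = \inf_{\upsilon} \bigg\{ \int_t^{\theta} F\big(s, \mu_s, \upsilon(s, \cdot, \mu_s)\big)\,ds + v(\theta, \mu_\theta) \bigg\},$$

where $(\mu_s)_{s\ge t}$ is the deterministic flow of marginals started from $\mu_t = \mu$ under the feedback $\upsilon$.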

Key to the analytic machinery is the notion of differentiability with respect to probability measures, as formalized by Lions. For a function $u: \mathcal{P}_2(\mathbb{R}^d)\to\mathbb{R}$, the "lift" to $L^2(\Omega;\mathbb{R}^d)$ is defined by $U(X) = u(\mathbb{P}_X)$. If $U$ is Fréchet differentiable, the Lions derivative $\partial_\mu u(\mu)(x)$ exists and supplies the first-order object for an Itô calculus on the Wasserstein space (Pham et al., 2015).
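A useful numerical intuition is the standard particle identity $\partial_{x_i} u(\mu^N) = \frac{1}{N}\,\partial_\mu u(\mu^N)(x_i)$ on the empirical measure $\mu^N$ of $N$ particles. The sketch below checks it by finite differences for an example function chosen purely for illustration:

```python
import numpy as np

# Finite-difference check of the Lions derivative on an empirical measure.
# Example function (chosen for illustration): u(mu) = (int x mu(dx))^2,
# whose Lions derivative is d_mu u(mu)(x) = 2 * mean(mu), constant in x.
rng = np.random.default_rng(1)
N, h, i = 1000, 1e-6, 0

X = rng.standard_normal(N)
u = lambda particles: particles.mean() ** 2

Xp = X.copy()
Xp[i] += h
finite_diff = (u(Xp) - u(X)) / h   # derivative of u(mu^N) in particle x_i
lions = 2.0 * X.mean()             # closed-form Lions derivative at x_i

# Particle identity d/dx_i u(mu^N) = (1/N) * d_mu u(mu^N)(x_i):
print(N * finite_diff, lions)      # the two numbers agree up to O(h)
```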

Applying the extended Itô formula yields

$$d\,u(t, \mu_t) = \partial_t u(t,\mu_t)\,dt + \mathbb{E}\big[\partial_\mu u(t,\mu_t)(X_t)\cdot b_t \big]\,dt + \frac{1}{2}\, \mathbb{E}\big[ \operatorname{Tr}\big( \partial_x \partial_\mu u(t,\mu_t)(X_t)\, \sigma_t \sigma_t^\top \big)\big]\,dt.$$

Combining this chain rule with the DPP leads to the mean-field HJB equation.
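As a sanity check (a standard example, not specific to the cited paper), take $u(\mu) = \int |x|^2\,\mu(dx)$, so that $\partial_\mu u(\mu)(x) = 2x$ and $\partial_x\partial_\mu u(\mu)(x) = 2 I_d$; the chain rule then collapses to the familiar second-moment identity

$$\frac{d}{dt}\,\mathbb{E}\big[|X_t|^2\big] = 2\,\mathbb{E}[X_t \cdot b_t] + \mathbb{E}\big[\operatorname{Tr}(\sigma_t\sigma_t^\top)\big].$$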

3. Mean-Field Hamilton-Jacobi-Bellman Equation: Structure and Interpretation

The resulting Bellman PDE on $[0,T)\times\mathcal{P}_2(\mathbb{R}^d)$ takes the form

$$\partial_t v(t,\mu) + \inf_{\upsilon} \bigg\{ F(t,\mu,\upsilon) + \int_{\mathbb{R}^d} \big\langle \partial_\mu v(t,\mu)(x),\, b(t,x,\mu,\upsilon(x)) \big\rangle\,\mu(dx) + \frac{1}{2} \int_{\mathbb{R}^d} \operatorname{Tr}\big( \partial_x \partial_\mu v(t,\mu)(x)\, \sigma\sigma^\top(t,x,\mu,\upsilon(x)) \big)\,\mu(dx) \bigg\} = 0,$$

with terminal condition

$$v(T,\mu) = G(\mu).$$

The Hamiltonian is given explicitly in terms of the drift, diffusion, and cost, with the infimum taken over Markov feedback controls $\upsilon$. The appearance of Lions derivatives reflects the infinite-dimensional geometry of $\mathcal{P}_2$, making the equation genuinely non-local and nonlinear in its distributional argument (Pham et al., 2015).

4. Solution Concepts: Classical, Verification, and Viscosity

Existence and uniqueness of solutions depend on regularity:

  • Classical solutions and verification: If $w\in C^{1,2}$ solves the Bellman equation and the infimum is achieved by a Lipschitz feedback $\upsilon^*$, then $w = v$ and the associated closed-loop control is optimal. The proof applies Itô's formula to $w(s, \mu_s)$ and uses the PDE to dominate the cost functional, with equality attained at the feedback minimizer (Pham et al., 2015).
  • Viscosity solutions: If smoothness fails, the equation is lifted to $L^2(\Omega;\mathbb{R}^d)$, and a notion of viscosity solution is constructed via test functions that are themselves lifts from $\mathcal{P}_2$. The value function $v$ is shown to satisfy the viscosity solution conditions, and a comparison principle holds for sub- and supersolutions under suitable growth controls, implying uniqueness.

These results ensure the well-posedness of the mean-field Bellman equation in wide generality, even in the presence of measure dependence and degenerate diffusion.
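As a crude numerical counterpart of the verification argument (reusing the hypothetical simulate_cost sketch from Section 1; purely illustrative, with placeholder dynamics), one can scan the gain of the affine feedback and confirm the simulated cost is locally minimal near the candidate optimizer:

```python
# Reusing simulate_cost from the sketch in Section 1 (illustrative only):
# scan the feedback gain and compare the resulting Monte Carlo costs.
for k in (0.3, 0.5, 0.7):
    J_k = simulate_cost(lambda t, x, m, k=k: k * (m - x))
    print(f"gain {k:.1f}: J = {J_k:.4f}")
```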

5. Linear-Quadratic Explicit Solutions and Applications

In the linear-quadratic (LQ) case, $b$ and $\sigma$ are affine in $x$ and $\mu$, and the costs are quadratic:

$$b(t,x,\mu,a) = b_0(t) + B(t)x + B_1(t)\bar{\mu} + C(t)a + C_1(t)\bar{\nu}, \qquad \sigma(t,x,\mu,a) = \sigma_0(t) + D(t)x + D_1(t)\bar{\mu} + F(t)a + F_1(t)\bar{\nu},$$

where $\bar{\mu} = \int x\,\mu(dx)$ and $\bar{\nu} = \int a\,\nu(da)$, with $\nu$ the law of the control. The value function admits a closed form as a quadratic function of $\mu$:

$$v(t,\mu) = \operatorname{Var}(\mu)(A(t)) + \bar{\mu}^\top \Lambda(t)\,\bar{\mu} + y(t)^\top\bar{\mu} + x(t),$$

where $\operatorname{Var}(\mu)(A(t))$ denotes the $A(t)$-weighted variance $\int (x-\bar{\mu})^\top A(t)\,(x-\bar{\mu})\,\mu(dx)$, and $A, \Lambda, y, x$ evolve according to coupled Riccati and linear ODEs.
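The sketch below integrates a pair of scalar Riccati ODEs backward from the terminal condition, mirroring this structure: $A(t)$ weights the variance part of $v$ and $\Lambda(t)$ the squared mean. The coefficients and the exact right-hand sides are illustrative placeholders, not the paper's system.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Backward integration of two coupled scalar Riccati ODEs of the type arising
# in mean-field LQ control: A(t) weights Var(mu) and Lam(t) weights |mean|^2.
# All coefficients and right-hand sides below are illustrative placeholders.
b1, c, q, qbar, r, g = 0.2, 1.0, 1.0, 0.5, 1.0, 1.0
T = 1.0

def riccati_rhs(t, P):
    A, Lam = P
    dA = -(2 * b1 * A - (c**2 / r) * A**2 + q)               # variance block
    dLam = -(2 * b1 * Lam - (c**2 / r) * Lam**2 + q + qbar)  # mean block
    return [dA, dLam]

sol = solve_ivp(riccati_rhs, (T, 0.0), [g, g], rtol=1e-8)    # integrate T -> 0
A0, Lam0 = sol.y[:, -1]
print(f"A(0) = {A0:.4f}, Lambda(0) = {Lam0:.4f}")
# The optimal feedback is then affine in the deviation from the mean, e.g.
# a*(t, x, mu) = -(c / r) * (A(t) * (x - mean(mu)) + Lam(t) * mean(mu)).
```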

These explicit solutions undergird applications including:

  • Mean-variance portfolio selection, recovering classical optimal investment formulas.
  • Systemic risk in inter-bank models, where the optimal borrowing-lending rates are obtained as affine functions of deviation from the population mean (Pham et al., 2015).

6. Open-Loop vs. Closed-Loop Controls and Equivalence

Under open-loop controls, where policies need not be feedback functions of $\mathbb{P}_{X_t}$, the DPP remains valid on $\mathcal{P}_2$ with a modified Hamiltonian. Under mild integrability conditions, however, the infima of the open-loop and closed-loop formulations coincide, so the HJB equation and the optimal values agree in both settings. This equivalence is particularly robust in the LQ case (Pham et al., 2015).

7. Relation to Mean-Field Game Theory and Extensions

The mean-field HJB equation forms the optimality condition in mean-field type control, mean-field games, and certain large-system Markov decision frameworks. It connects to the mean-field game master equation, which describes the limit of Nash equilibria for large populations and dualizes with Fokker–Planck equations for the state law, as in the foundational theory developed by Lions and subsequent works (Pham et al., 2015, Bensoussan et al., 2014, Gast et al., 2010).

Extensions include:

  • Master equations coupling the value function with the law and individual state (mean-field “social optimization” and necessary conditions for $\epsilon$-person-by-person optimality) (Huang et al., 19 Aug 2025).
  • Infinite-dimensional PDEs arising in storage models or delayed systems, formulated in Hilbert or Banach spaces and linked analytically to the mean-field HJB structure (Bertucci et al., 2022, Fouque et al., 2018).
  • Weak/viscosity solutions in spaces of measures (e.g., via Fourier mode truncation) accommodating highly singular or non-convex data (Cecchin et al., 2022).

The rigorous study of mean-field HJB equations continues to drive both mathematical theory and applications in stochastic control, finance, systemic risk, and large-scale engineered and physical systems.
