Mean-Field Control Problem
- Mean-field control problems are defined by stochastic dynamics whose evolution and cost depend explicitly on the distribution of states and controls.
- They combine probabilistic, analytic, and computational methods, such as stochastic maximum principles, Bellman equations in Wasserstein space, and deep learning approximations.
- These frameworks capture the limits of large multi-agent systems, enabling rigorous convergence analyses and scalable solutions in finance, engineering, and statistical physics.
A mean-field control problem concerns the optimal control of stochastic dynamical systems whose evolution and performance criteria depend explicitly on the distributional law of the state and, in many cases, the law of the control. This class subsumes McKean–Vlasov or distribution-dependent stochastic control frameworks and is motivated by systems of many interacting agents, where the population aggregate or average enters directly into both the state evolution and the objectives. This structure yields intrinsic mathematical challenges, particularly regarding time inconsistency, infinite-dimensional state spaces (measures), partial information scenarios, and difficulties in applying classical control principles. The field advances through the synthesis of probabilistic (e.g., stochastic maximum principle, filtering, mean-field FBSDEs), analytic (e.g., Bellman equations on Wasserstein spaces, viscosity solutions), and computational (e.g., deep learning, propagation of chaos, particle methods) methodologies, with clear applications in finance, economics, engineering, and statistical physics.
1. Mathematical Formulation of Mean-Field Control Problems
A mean-field control problem is typically characterized by controlled stochastic dynamics in which the coefficients depend functionally on the law of the state and possibly the control, often written as

$$dX_t = b\big(t, X_t, \mathcal{L}(X_t, \alpha_t), \alpha_t\big)\,dt + \sigma\big(t, X_t, \mathcal{L}(X_t, \alpha_t), \alpha_t\big)\,dW_t,$$

with cost functional

$$J(\alpha) = \mathbb{E}\left[\int_0^T f\big(t, X_t, \mathcal{L}(X_t, \alpha_t), \alpha_t\big)\,dt + g\big(X_T, \mathcal{L}(X_T)\big)\right]$$

to be minimized (or maximized) over admissible controls $\alpha$, where $\mathcal{L}(X_t, \alpha_t)$ denotes the joint law of the state and the control. The joint law dependency allows the modeling of "extended" mean-field effects where both state and control distributions interact in the dynamics and cost (Djete, 2020).
Specializations include:
- State law dependence only (standard McKean–Vlasov SDEs, e.g., (Pham et al., 2015)), or
- Joint state–control law dependence (extended mean-field control, (Djete, 2020)).
The control process may be required to be progressively measurable with respect to filtrations generated by the underlying Brownian motion(s), by the observation process in partially observed frameworks, or by common noise (see (Bouchard et al., 18 Sep 2025)).
Further, dynamics with drift control only, with control in both drift and diffusion (Bahlali et al., 2017), and infinite-dimensional (SPDE) settings (Tang et al., 2016) have been studied.
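To make the formulation concrete, the following minimal sketch (illustrative throughout, not taken from the cited works) evaluates the cost of a fixed feedback control for a one-dimensional extended mean-field model via an $N$-particle Euler–Maruyama approximation; the coefficients, the feedback rule, and all parameter values are assumptions chosen for simplicity.

```python
import numpy as np

# Minimal sketch (illustrative assumptions throughout): evaluate the cost
# J(alpha) of a fixed feedback control for a 1-d extended mean-field model
#   dX_t = (E[X_t] + alpha_t) dt + sigma dW_t,
#   J = E[ int_0^T (X_t^2 + alpha_t^2 + E[alpha_t]^2) dt + X_T^2 ],
# using an N-particle Euler-Maruyama approximation of the McKean-Vlasov SDE.

rng = np.random.default_rng(0)
N, T, n_steps, sigma = 5_000, 1.0, 100, 0.5
dt = T / n_steps

def feedback(x, mean_x):
    # Hypothetical linear feedback alpha(x, mu) = -x - mean(mu).
    return -x - mean_x

X = rng.normal(1.0, 0.3, size=N)   # initial particles sampled from mu_0
cost = np.zeros(N)
for _ in range(n_steps):
    m = X.mean()                   # empirical proxy for E[X_t]
    a = feedback(X, m)
    cost += (X**2 + a**2 + a.mean()**2) * dt   # running cost incl. control-law term
    X += (m + a) * dt + sigma * np.sqrt(dt) * rng.normal(size=N)
cost += X**2                       # terminal cost g(X_T) = X_T^2
print(f"Monte Carlo estimate of J(alpha): {cost.mean():.4f}")
```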
2. Dynamic Programming and the Bellman Equation in the Wasserstein Space
Mean-field control problems induce a natural infinite-dimensional state space: the Wasserstein space $\mathcal{P}_2(\mathbb{R}^d)$ of probability measures with finite second moment. The dynamic programming principle (DPP) asserts that the value function $v(t,\mu)$, with $\mu$ the marginal law of the state at time $t$, satisfies a nonlinear Hamilton–Jacobi–Bellman equation of the form

$$\partial_t v(t,\mu) + \inf_{\alpha(\cdot)} \int_{\mathbb{R}^d} \Big[ \partial_\mu v(t,\mu)(x) \cdot b\big(x,\mu,\alpha(x)\big) + \tfrac{1}{2}\operatorname{tr}\Big( \sigma\sigma^{\top}\big(x,\mu,\alpha(x)\big)\, \partial_x \partial_\mu v(t,\mu)(x) \Big) + f\big(x,\mu,\alpha(x)\big) \Big]\, \mu(dx) = 0,$$

where the integral term is a "measure generator" involving the (Lions) derivative $\partial_\mu v$ of $v$ with respect to the measure variable, and the infimum runs over feedback controls $\alpha(\cdot)$ (Pham et al., 2015, Crescenzo et al., 26 Jul 2024).
For extended mean-field control (dependence on the joint law), the Bellman equation operates on vector-valued measure trajectories, and the associated controlled Fokker–Planck equation becomes a key analytical object (Djete, 2020).
The infinite-dimensionality and the nonlocal nature of the dependence demand advanced calculus on $\mathcal{P}_2(\mathbb{R}^d)$, especially for rigorous notions of (viscosity) solutions and chain rules (Itô formula for measure flows, (Crescenzo et al., 26 Jul 2024)).
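As a point of reference, in the absence of common noise this chain rule takes the following standard form: for dynamics $dX_t = b_t\,dt + \sigma_t\,dW_t$ with marginal flow $\mu_t = \mathcal{L}(X_t)$ and a sufficiently smooth $u:\mathcal{P}_2(\mathbb{R}^d)\to\mathbb{R}$,

$$u(\mu_T) = u(\mu_0) + \int_0^T \mathbb{E}\Big[ \partial_\mu u(\mu_t)(X_t)\cdot b_t + \tfrac{1}{2}\operatorname{tr}\big( \sigma_t\sigma_t^{\top}\, \partial_x \partial_\mu u(\mu_t)(X_t) \big) \Big]\,dt;$$

under common noise an additional second-order term in the Lions derivative appears.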
3. Pontryagin Maximum Principle, Adjoint Equations, and Filtering
Due to the lack of Markovian structure, classical dynamic programming does not always apply, especially in the presence of partial observation or path-dependence (Wang et al., 2015, Buckdahn et al., 2017). In such contexts, the stochastic maximum principle is a central tool:
- The adjoint process solves backward (and in mean-field settings, often mean-field backward) stochastic differential equations (BSDEs).
- In partially observed setups, the adjoint equations incorporate conditional expectations with respect to the observation filtration and lead to coupled forward–backward SDEs, often requiring decomposition techniques to achieve separation between estimation and control (Wang et al., 2015, Tang et al., 2016).
For general mean-field systems and models with model uncertainty, the maximum principle may require operator-valued or measure-differentiated adjoint equations (Agram et al., 2016), including operator-valued BSDEs.
When both drift and diffusion are controlled—a setting that necessitates careful formulation—the relaxed control must be defined via an orthogonal martingale measure to properly capture the stochasticity contributed by the control in the diffusion term (Bahlali et al., 2017).
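For state-law dependence only, the maximum principle can be summarized in the following standard form (cf. Buckdahn et al., 2017), where $(\widetilde{X},\widetilde{\alpha},\widetilde{p},\widetilde{q})$ denotes an independent copy of $(X,\alpha,p,q)$ and $\mu_t = \mathcal{L}(X_t)$. Define the Hamiltonian

$$H(t,x,\mu,a,p,q) = b(t,x,\mu,a)\cdot p + \operatorname{tr}\big( \sigma^{\top}(t,x,\mu,a)\, q \big) + f(t,x,\mu,a);$$

the adjoint pair $(p,q)$ solves the mean-field BSDE

$$-dp_t = \Big( \partial_x H(t,X_t,\mu_t,\alpha_t,p_t,q_t) + \widetilde{\mathbb{E}}\big[ \partial_\mu H\big(t,\widetilde{X}_t,\mu_t,\widetilde{\alpha}_t,\widetilde{p}_t,\widetilde{q}_t\big)(X_t) \big] \Big)\,dt - q_t\,dW_t, \qquad p_T = \partial_x g(X_T,\mu_T) + \widetilde{\mathbb{E}}\big[ \partial_\mu g\big(\widetilde{X}_T,\mu_T\big)(X_T) \big],$$

and an optimal control $\alpha_t$ minimizes $a \mapsto H(t,X_t,\mu_t,a,p_t,q_t)$ along the optimal trajectory (pointwise, under convexity assumptions on the control domain).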
4. Convergence, Approximations, and Propagation of Chaos
A pivotal role of mean-field control theory is to serve as the rigorous limit of large, centrally controlled multi-agent systems.
- As the number of agents $N \to \infty$, the empirical measure of agent states converges (propagation of chaos) to the solution of the McKean–Vlasov SDE, and the finite-population control problem converges to the mean-field problem (Djete, 2020).
- Quantitative weak convergence rates for the value function and optimal controls can be obtained via Wasserstein distances and (when applicable) backward SDE characterizations (Bouchard et al., 18 Sep 2025).
- The equivalence between strong, weak, and relaxed (measure-valued) formulations is established in general, supporting both theoretical and numerical approximation approaches (Bouchard et al., 18 Sep 2025, Bahlali et al., 2017).
Discretized (e.g., Markov decision process) and continuous-time (FBSDE, PDE) approximations, as well as particle-based computational methods and the training of feedback controllers, are central computational approaches (Cui et al., 2021).
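The following minimal sketch (an illustrative linear model, not from the cited works) exhibits this convergence numerically: the $N$-particle system replaces $\mathbb{E}[X_t]$ by the empirical mean, and the gap to the mean-field limit should shrink at roughly the Monte Carlo rate $O(N^{-1/2})$.

```python
import numpy as np

# Minimal sketch of propagation of chaos for an uncontrolled linear
# McKean-Vlasov SDE (illustrative model and parameters):
#   dX_t = a * E[X_t] dt + sigma dW_t,  X_0 ~ N(1, 0.1),
# whose mean satisfies the ODE  dm/dt = a * m,  m(0) = 1.
# The N-particle system substitutes the empirical mean for E[X_t].

rng = np.random.default_rng(1)
a, sigma, T, n_steps = -1.0, 0.4, 1.0, 200
dt = T / n_steps
m_limit = 1.0 * np.exp(a * T)          # exact mean of the limit at time T

for N in (100, 1_000, 10_000, 100_000):
    X = rng.normal(1.0, np.sqrt(0.1), size=N)
    for _ in range(n_steps):
        X += a * X.mean() * dt + sigma * np.sqrt(dt) * rng.normal(size=N)
    print(f"N={N:>6}:  |empirical mean - limit| = {abs(X.mean() - m_limit):.5f}")
```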
5. Partial Observation and Filtering
Partial observation brings unique technical challenges:
- The observed process is a noisy, possibly control-influenced function of the true (hidden) state, requiring the optimal control process to be adapted to the observation filtration (Wang et al., 2015, Tang et al., 2016, Buckdahn et al., 2017).
- The conditional law of the state, given the observation process, is both an argument of the coefficients and a quantity that must be estimated online (nonlinear filtering problem).
- The circularity of mean-field dependence and partial observation often breaks the classical separation principle, necessitating analysis via backward separation methods, decomposition techniques, and coupled forward–backward filters (Wang et al., 2015, Buckdahn et al., 2017).
- Such settings are intrinsically time-inconsistent, and dynamic programming principles must be reformulated in the infinite-dimensional law space (Buckdahn et al., 2017).
Applications include asset-liability management with recursive utility, systemic risk, and engineering systems with distributed sensors (Wang et al., 2015).
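As a toy illustration of the estimation loop described above, the following linear-Gaussian sketch (all coefficients are assumptions, not the cited papers' models) propagates a hidden state whose drift involves the population mean, which in the linear case evolves deterministically, and tracks the conditional law with a Kalman-type filter.

```python
import numpy as np

# Minimal linear-Gaussian sketch (illustrative):
#   X_{k+1} = X_k + (a X_k + b m_k) dt + sqrt(dt) sig_x eps_k   (hidden state)
#   Y_k     = X_k + sig_y eta_k                                  (observation)
# where m_k = E[X_k] satisfies the deterministic recursion
#   m_{k+1} = m_k + (a + b) m_k dt,
# so a Kalman filter can track the conditional law N(x_hat, P) online.

rng = np.random.default_rng(2)
a, b, dt, sig_x, sig_y, n_steps = -0.5, 0.3, 0.01, 0.2, 0.1, 500

x, m = 1.5, 1.0                     # true state; population mean E[X_0]
x_hat, P = m, 0.5                   # filter mean and variance (prior)
for _ in range(n_steps):
    # propagate the true state, the mean-field term, and the filter prior
    x = x + (a * x + b * m) * dt + sig_x * np.sqrt(dt) * rng.normal()
    x_hat = x_hat + (a * x_hat + b * m) * dt
    P = P + (2 * a * P + sig_x**2) * dt
    m = m + (a + b) * m * dt
    # noisy observation and Kalman update of the conditional law
    y = x + sig_y * rng.normal()
    K = P / (P + sig_y**2)
    x_hat += K * (y - x_hat)
    P *= (1 - K)
print(f"true state {x:+.3f}, filtered mean {x_hat:+.3f}, filter var {P:.4f}")
```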
6. Extensions, Robust and Selective Control, and Learning
Advanced variations of the mean-field control problem have been formulated to address additional structural complexities:
- Model uncertainty (ambiguity in the law of the state), using a two-player game framework with measure-valued and classical controls (Agram et al., 2016).
- Selective control and transient leadership: By incorporating activation functions and population labels that evolve via Markovian jump processes, selective control strategies act on dynamically-identified subpopulations (Albi et al., 2021).
- Robust control: The synthesis of mixed $H_2/H_\infty$ optimal controllers for mean-field systems with affine terms is achieved via mean-field stochastic bounded real lemmas, Riccati equations, and forward–backward stochastic systems (Fang et al., 26 Jul 2025).
- Mean-field Markov decision processes: Existence of optimal (stationary) policies, average reward formulations, and the use of Markov Chain Monte Carlo (MCMC) for static measure-optimality in special subclasses (Bäuerle, 2021).
- Data-driven and deep learning approaches: Deterministic score-based neural network algorithms bypass FBSDE sampling, allowing for tractable, scalable algorithmic solutions for high-dimensional mean-field control PDEs (Zhou et al., 17 Jan 2024), with explicit error guarantees and statistical estimation via linear function approximations (Bayraktar et al., 2 Aug 2024).
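As a generic illustration of training feedback controllers on particle approximations (a plain differentiable-simulation sketch, not the score-based algorithm of (Zhou et al., 17 Jan 2024) or the estimators of (Bayraktar et al., 2 Aug 2024); the model, network, and hyperparameters are assumptions), one can backpropagate the empirical cost of an $N$-particle rollout through a neural feedback law:

```python
import torch

# Illustrative sketch: train a neural feedback control alpha_theta(x, E[X])
# by direct differentiation of the empirical cost of an N-particle
# approximation of
#   dX_t = alpha_t dt + sigma dW_t,
#   J = E[ int_0^T (X_t^2 + alpha_t^2) dt + X_T^2 ].

torch.manual_seed(0)
N, T, n_steps, sigma = 2_000, 1.0, 50, 0.3
dt = T / n_steps

policy = torch.nn.Sequential(       # feedback (x, E[X]) -> alpha in R
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for it in range(200):
    X = torch.randn(N, 1) * 0.3 + 1.0           # particles ~ mu_0
    noise = torch.randn(n_steps, N, 1)
    cost = torch.zeros(N, 1)
    for k in range(n_steps):
        m = X.mean() * torch.ones(N, 1)         # empirical-mean feature
        a = policy(torch.cat([X, m], dim=1))
        cost = cost + (X**2 + a**2) * dt
        X = X + a * dt + sigma * dt**0.5 * noise[k]
    loss = (cost + X**2).mean()                 # empirical J(alpha_theta)
    opt.zero_grad(); loss.backward(); opt.step()
    if it % 50 == 0:
        print(f"iter {it:3d}  empirical cost {loss.item():.4f}")
```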
7. Open Problems and Current Research Directions
Open questions and active research threads include:
- General fully nonlinear, possibly path-dependent, mean-field control master equations, and their regularity and well-posedness (Bouchard et al., 18 Sep 2025).
- Viscosity theory for Bellman equations on Wasserstein-valued function spaces (Crescenzo et al., 26 Jul 2024).
- New randomisation techniques for mean-field control with common noise based on Poisson point process intensities and associated BSDE/BSVI representations of the value function (Denkert et al., 30 Dec 2024).
- Computation of robust and partially observed controls in high-dimensional or infinite-population limits, including learning-based, MCMC, and particle system methods (Cui et al., 2021, Zhou et al., 17 Jan 2024, Bayraktar et al., 2 Aug 2024).
- Systematic quantification of the effect of approximation (model mismatch, population finiteness, learning error) on control performance and policy convergence (Bouchard et al., 18 Sep 2025, Bayraktar et al., 2 Aug 2024).
- Integration of mean-field theories in heterogeneous and non-exchangeable agent populations (Crescenzo et al., 26 Jul 2024).
Recent developments emphasize combining theoretical regularity and stability results (Wasserstein continuity, BSDE representations, Γ-convergence) with computational scalability, so that mean-field control solutions are not only theoretically optimal but also implementable in real-world many-agent systems subject to practical information and model constraints.