
Linear-Quadratic Mean-Field Control

Updated 5 December 2025
  • Linear-Quadratic Mean-Field Control is a framework that couples individual and collective dynamics in large populations using two coupled Riccati equations.
  • It employs forward-backward stochastic differential equations to derive explicit feedback laws separating fluctuations from mean-field components.
  • The framework underpins scalable and stable control policies applicable to systems like energy grids and multi-agent robotics.

A linear-quadratic mean-field control (LQ-MFC) framework is a rigorous optimal control methodology for large-population or interacting stochastic dynamical systems, in which both the state evolution and the performance criterion depend linearly on the system state and control, and quadratically on the state, the control, and their means. The framework generalizes classical LQ control to collective, or "mean-field," settings by coupling the dynamics and cost through the system's mean, leading to a rich yet tractable analysis in both stochastic and deterministic formulations. The optimal control is characterized by a system of forward-backward stochastic differential equations (FBSDEs) for the state and adjoint processes; this system is decoupled through two coupled matrix Riccati equations and yields a feedback control that separates fluctuation and mean-field components. This structure has broad implications for stability, scalability, and implementation in high-dimensional systems.

1. Problem Formulation and Mean-Field Coupling

The LQ-MFC problem is defined on a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_t\},\mathbb{P})$ carrying a Brownian motion $W(\cdot)$. For $t \in [0,T]$:

  • The controlled mean-field SDE is

$$
\begin{aligned}
dX(t) = {} & \bigl[ A(t) X(t) + \bar{A}(t)\,\mathbb{E}[X(t)] + B(t) u(t) + \bar{B}(t)\,\mathbb{E}[u(t)] \bigr]\,dt \\
& + \bigl[ C(t) X(t) + \bar{C}(t)\,\mathbb{E}[X(t)] + D(t) u(t) + \bar{D}(t)\,\mathbb{E}[u(t)] \bigr]\,dW(t), \\
& X(0) = x \in \mathbb{R}^n,
\end{aligned}
$$

where $X(t)$ is the state, $u(t)$ the control, and $A,\bar{A},B,\bar{B},C,\bar{C},D,\bar{D}$ are deterministic matrix-valued coefficients.

  • The cost functional is

$$
\begin{aligned}
J(u) = \mathbb{E}\Bigl[ \int_0^T \Bigl\{ & X(t)^T Q(t) X(t) + \bigl(\mathbb{E}[X(t)]\bigr)^T \bar{Q}(t)\,\mathbb{E}[X(t)] \\
& + u(t)^T R(t) u(t) + \bigl(\mathbb{E}[u(t)]\bigr)^T \bar{R}(t)\,\mathbb{E}[u(t)] \Bigr\}\,dt \\
& + X(T)^T G X(T) + \bigl(\mathbb{E}[X(T)]\bigr)^T \bar{G}\,\mathbb{E}[X(T)] \Bigr],
\end{aligned}
$$

with $Q, \bar Q, R, \bar R, G, \bar G$ symmetric and, under well-posedness conditions, positive (semi-)definite (Yong, 2011).

Mean-field terms such as $\bar{A},\bar{B},\bar{C},\bar{D},\bar Q, \bar R, \bar G$ induce population-level interdependence, as the dynamics and costs depend not only on each agent’s state and control but also on their collective averages.
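The mean-field coupling can be made concrete with a particle approximation: replace $\mathbb{E}[X(t)]$ by the empirical average over many simulated copies and apply Euler–Maruyama. The sketch below uses a simplified scalar instance (mean-field term only in the drift) with arbitrary illustrative coefficients and an arbitrary linear feedback $u = -kX$; none of these values come from the text.

```python
import numpy as np

def simulate_mean_field_sde(n_particles=5000, n_steps=200, T=1.0, seed=0):
    """Euler-Maruyama particle approximation of the scalar mean-field SDE
        dX = (A X + Abar E[X] + B u) dt + (C X + D u) dW,
    with E[X(t)] replaced by the empirical particle mean and an illustrative
    linear feedback u = -k X. All coefficient values are arbitrary."""
    rng = np.random.default_rng(seed)
    A, Abar, B, C, D, k = -1.0, 0.5, 1.0, 0.2, 0.1, 0.5
    dt = T / n_steps
    X = np.ones(n_particles)  # deterministic initial state x = 1
    for _ in range(n_steps):
        m = X.mean()                       # empirical proxy for E[X(t)]
        u = -k * X                         # illustrative feedback control
        dW = rng.normal(0.0, np.sqrt(dt), n_particles)
        X = X + (A * X + Abar * m + B * u) * dt + (C * X + D * u) * dW
    return X

X_T = simulate_mean_field_sde()
```

With these values the mean obeys $\dot m = (A + \bar A - Bk)\,m = -m$, so the empirical average at $T=1$ sits near $e^{-1} \approx 0.37$, illustrating how the population average feeds back into each particle's drift.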

2. Pontryagin Principle, MF-FBSDEs, and the Optimality System

The necessary condition for optimality is derived via the stochastic maximum principle. Introducing adjoint processes $(Y(t), Z(t))$, the Hamiltonian is

$$
\begin{aligned}
\mathcal{H}(t,X,u,Y,Z) = {} & \bigl\langle Y,\; A X + \bar{A}\,\mathbb{E}[X] + B u + \bar{B}\,\mathbb{E}[u] \bigr\rangle \\
& + \bigl\langle Z,\; C X + \bar{C}\,\mathbb{E}[X] + D u + \bar{D}\,\mathbb{E}[u] \bigr\rangle \\
& + X^T Q X + (\mathbb{E}[X])^T \bar Q\,\mathbb{E}[X] + u^T R u + (\mathbb{E}[u])^T \bar R\,\mathbb{E}[u].
\end{aligned}
$$

The corresponding adjoint mean-field BSDE reads

$$
\begin{aligned}
dY(t) = {} & -\bigl\{ A^T Y(t) + \bar{A}^T \mathbb{E}[Y(t)] + C^T Z(t) + \bar{C}^T \mathbb{E}[Z(t)] + Q X(t) + \bar Q\,\mathbb{E}[X(t)] \bigr\}\,dt \\
& + Z(t)\,dW(t),
\end{aligned}
$$

with terminal condition $Y(T) = G X(T) + \bar G\,\mathbb{E}[X(T)]$.

The pointwise stationarity (optimality) condition is
$$
R u(t) + \bar R\,\mathbb{E}[u(t)] + B^T Y(t) + \bar B^T \mathbb{E}[Y(t)] + D^T Z(t) + \bar D^T \mathbb{E}[Z(t)] = 0.
$$
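It is worth spelling out how the separation into mean and fluctuation parts arises: taking expectations in the stationarity condition, and then subtracting the resulting mean equation from the original, gives
$$
\begin{aligned}
(R+\bar R)\,\mathbb{E}[u(t)] + (B+\bar B)^T \mathbb{E}[Y(t)] + (D+\bar D)^T \mathbb{E}[Z(t)] &= 0, \\
R\,\bigl(u(t)-\mathbb{E}[u(t)]\bigr) + B^T \bigl(Y(t)-\mathbb{E}[Y(t)]\bigr) + D^T \bigl(Z(t)-\mathbb{E}[Z(t)]\bigr) &= 0.
\end{aligned}
$$
The first equation determines the mean control $\mathbb{E}[u(t)]$ through the aggregated coefficients, while the second determines the deviation $u(t)-\mathbb{E}[u(t)]$; this is the origin of the two-gain feedback structure.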

The full optimality system thus comprises a coupled mean-field forward SDE (for the state) and backward SDE (for the adjoint), together with the stationarity condition (Yong, 2011).

3. Riccati Decoupling and Feedback Law Construction

Decoupling the mean-field FBSDE is achieved via the affine ansatz
$$
Y(t) = P(t)\bigl[X(t) - \mathbb{E}[X(t)]\bigr] + \Pi(t)\,\mathbb{E}[X(t)],
$$

$$
Z(t) = P(t)\bigl[C\,(X-\mathbb{E}[X]) + D\,(u - \mathbb{E}[u])\bigr] + P(t)\bigl[(C+\bar{C})\,\mathbb{E}[X] + (D+\bar{D})\,\mathbb{E}[u]\bigr],
$$

as $Z$ is the $dW$-coefficient of $Y$, i.e. $P(t)$ times the full diffusion coefficient of $X$.

The decoupling yields two deterministic matrix Riccati ODEs:

  • For the fluctuations (the “variance” Riccati equation for $P$):

$$
\frac{dP}{dt} + P A + A^T P + C^T P C + Q - (P B + C^T P D)\,\bigl(R + D^T P D\bigr)^{-1}\,(B^T P + D^T P C) = 0, \qquad P(T) = G.
$$

  • For the mean-field component (the “mean” Riccati equation for $\Pi$):

$$
\begin{aligned}
\frac{d\Pi}{dt} & + \Pi (A+\bar{A}) + (A+\bar{A})^T \Pi + Q + \bar Q + (C+\bar C)^T P\,(C+\bar C) \\
& - \bigl[ \Pi (B+\bar{B}) + (C+\bar{C})^T P\,(D+\bar{D}) \bigr]\,\bigl[R+\bar{R}+(D+\bar D)^T P\,(D+\bar D)\bigr]^{-1} \\
& \qquad \times \bigl[(B+\bar{B})^T \Pi + (D+\bar D)^T P\,(C+\bar C)\bigr] = 0, \qquad \Pi(T) = G+\bar G.
\end{aligned}
$$

Under standard positive definiteness assumptions, these Riccati ODEs admit unique, symmetric solutions (Yong, 2011).

The optimal control then admits the feedback representation
$$
\begin{aligned}
u^*(t) = {} & -\bigl(R + D^T P D\bigr)^{-1} \bigl[ B^T P(t) + D^T P(t)\, C \bigr]\,\bigl[ X(t) - \mathbb{E}[X(t)] \bigr] \\
& - \bigl[R+\bar R + (D+\bar D)^T P\,(D+\bar D)\bigr]^{-1}\bigl[ (B+\bar B)^T \Pi + (D+\bar D)^T P\,(C+\bar C) \bigr]\,\mathbb{E}[X(t)].
\end{aligned}
$$
This decomposes into separate feedback gains for the deviation and for the mean: the mean-field gain governs collective regulation, while the fluctuation gain stabilizes deviations from the mean.
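As a concrete sketch, the scalar ($n = m = 1$) versions of the two Riccati ODEs can be integrated backward in time numerically and the two feedback gains read off. All coefficient values are arbitrary illustrative choices, and the fluctuation gain uses the standard $R + D^T P D$ weighting:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Arbitrary scalar coefficients for illustration (not from the text).
A, Ab, B, Bb, C, Cb, D, Db = -0.5, 0.3, 1.0, 0.2, 0.4, 0.1, 0.2, 0.1
Q, Qb, R, Rb, G, Gb, T = 1.0, 0.5, 1.0, 0.2, 1.0, 0.5, 2.0

def riccati_rhs(s, y):
    """Backward ODEs written in remaining time s = T - t (so dP/ds = -dP/dt)."""
    P, Pi = y
    # fluctuation ("variance") Riccati, with the R + D'PD weighting
    S = B * P + D * P * C
    dP = 2 * A * P + C * C * P + Q - S * S / (R + D * D * P)
    # mean Riccati, built from the aggregated coefficients A+Ab, B+Bb, ...
    As, Bs, Cs, Ds = A + Ab, B + Bb, C + Cb, D + Db
    Sm = Bs * Pi + Cs * P * Ds
    dPi = 2 * As * Pi + Cs * Cs * P + Q + Qb - Sm * Sm / (R + Rb + Ds * Ds * P)
    return [dP, dPi]

# integrate from s = 0 (terminal conditions P(T)=G, Pi(T)=G+Gb) to s = T
sol = solve_ivp(riccati_rhs, [0.0, T], [G, G + Gb],
                dense_output=True, rtol=1e-8, atol=1e-10)

def gains(t):
    """Feedback gains at time t: u = -K(t)(X - E[X]) - Kbar(t) E[X]."""
    P, Pi = sol.sol(T - t)
    K = (B * P + D * P * C) / (R + D * D * P)
    Kbar = ((B + Bb) * Pi + (D + Db) * P * (C + Cb)) / (R + Rb + (D + Db) ** 2 * P)
    return K, Kbar
```

Note that the mean Riccati depends on $P$ (through the $(C+\bar C)^T P (C+\bar C)$ and $(D+\bar D)^T P (D+\bar D)$ terms) but not vice versa, so the pair can be solved jointly in a single backward pass.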

4. Structural Features, Well-Posedness, and Special Cases

A defining characteristic of the LQ-MFC framework is the necessity of solving two coupled Riccati equations, instead of only one as in classical LQ problems. The variance Riccati governs the stabilization of stochastic deviations from the mean, while the mean Riccati regulates the evolution of the population mean.

  • When all mean-field coefficients vanish ($\bar A = \bar B = \bar C = \bar D = \bar Q = \bar R = \bar G = 0$), the framework reduces to the standard LQ setting, with $\Pi \equiv P$ and a single Riccati equation (Yong, 2011).
  • Existence and uniqueness of solutions rely on the uniform positive definiteness of $R+\bar R+(D+\bar D)^T P\,(D+\bar D)$ and the non-negativity of $Q+\bar{Q}$ and $G+\bar G$, which together ensure convexity and coercivity of the cost functional.
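The reduction to classical LQ can be checked numerically: with all mean-field coefficients set to zero, the mean Riccati ODE collapses term by term onto the variance Riccati ODE, so $\Pi(t) = P(t)$ for all $t$. The scalar sketch below uses arbitrary illustrative coefficients:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Arbitrary scalar coefficients; all barred (mean-field) coefficients are zero.
A, B, C, D, Q, R, G, T = -0.3, 1.0, 0.5, 0.2, 1.0, 1.0, 0.8, 1.5

def rhs(s, y):
    """Both Riccati ODEs in remaining time s = T - t, with bars set to zero."""
    P, Pi = y
    dP = 2 * A * P + C * C * P + Q - (B * P + D * P * C) ** 2 / (R + D * D * P)
    # mean equation with Abar = Bbar = Cbar = Dbar = Qbar = Rbar = 0:
    dPi = 2 * A * Pi + C * C * P + Q - (B * Pi + D * P * C) ** 2 / (R + D * D * P)
    return [dP, dPi]

# identical terminal conditions, since Gbar = 0 implies Pi(T) = G
sol = solve_ivp(rhs, [0.0, T], [G, G], dense_output=True, rtol=1e-9, atol=1e-11)
ts = np.linspace(0.0, T, 50)
gap = float(np.max(np.abs(sol.sol(ts)[0] - sol.sol(ts)[1])))
```

Since the two right-hand sides coincide whenever $P = \Pi$ and the terminal conditions agree, `gap` is zero up to solver precision.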

A table summarizing key elements:

| Component | Classical LQ | LQ-MFC |
| --- | --- | --- |
| State equation | SDE, no mean-field terms | SDE with state- and control-mean coupling |
| Cost functional | Quadratic in state/control | Quadratic in state, control, and their means |
| Riccati equations | One matrix ODE | Two coupled matrix ODEs (variance and mean) |
| Feedback law | Linear state feedback | Linear feedback on deviation and mean |
| Well-posedness | Single positivity condition | Two-fold positivity for the Riccati solutions |

5. Ergodic and Infinite-Horizon Structure

For infinite-horizon mean-field LQ control with constant coefficients, the finite-horizon Riccati ODEs converge exponentially fast to the unique stabilizing solutions of the corresponding algebraic Riccati equations (AREs). This allows for construction of steady-state, time-homogeneous feedback laws and explicit characterizations of long-term average costs (Bayraktar et al., 13 Feb 2025, Huang et al., 2012). The solution is structured by splitting the system into:

  • A fluctuation subsystem, controlled by the “variance” Riccati;
  • A mean subsystem, controlled by the “mean” Riccati.

In this regime, the turnpike property holds: optimal finite-horizon trajectories remain exponentially close (except near the time boundaries) to the ergodic limit induced by the infinite-horizon feedback. This justifies using stationary policies in long-run applications (Bayraktar et al., 13 Feb 2025).
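The convergence of the finite-horizon Riccati solution to the algebraic Riccati equation (ARE) can be seen in a scalar example: integrating the variance Riccati backward over a long horizon, the solution flattens onto a constant value at which the ARE residual vanishes. Coefficients are arbitrary illustrative choices:

```python
from scipy.integrate import solve_ivp

# Arbitrary constant scalar coefficients (A > 0, so the open loop is unstable).
A, B, C, D, Q, R, G = 0.2, 1.0, 0.3, 0.1, 1.0, 1.0, 0.5

def f(P):
    """Riccati right-hand side in remaining time s = T - t;
    f(P) = 0 is exactly the scalar algebraic Riccati equation."""
    S = B * P + D * P * C
    return 2 * A * P + C * C * P + Q - S * S / (R + D * D * P)

# integrate backward over a long horizon and sample at two remaining times
sol = solve_ivp(lambda s, y: [f(y[0])], [0.0, 20.0], [G],
                dense_output=True, rtol=1e-10, atol=1e-12)
P10, P20 = sol.sol(10.0)[0], sol.sol(20.0)[0]
```

Because the flow contracts exponentially toward the stabilizing root of `f`, `P10` and `P20` agree to high precision and `f(P20)` is essentially zero, mirroring the exponential convergence and turnpike behavior described above.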

6. Extensions and Applications

The LQ-MFC framework extends to several variants:

  • Indefinite quadratic costs, handled via relaxed compensator techniques and further algebraic analysis (Li et al., 2020).
  • Infinite-population and mean-field-team formulations, where decentralized controllers and social optima can be constructed and analyzed for scalability and robustness (Arabneydi et al., 2016, Wang et al., 2020).
  • Systems with heterogeneous agents, cluster-based structures, and distributed information networks, leading to block-diagonal and local-global Riccati structure (Liang et al., 2022).
  • Incorporation of risk constraints, leading to risk-adjusted Riccati equations and affine control laws insensitive to population size (Roudneshin et al., 2023).

Real-world applications arise in macroeconomic planning, networked control of multi-agent systems, distributed energy grids, and large-scale robotics, wherever systemic state or control averages are crucial performance determinants.

7. Theoretical and Computational Considerations

Solving the LQ-MFC problem involves integrating two coupled Riccati equations, with complexity independent of the agent population; thus the method is well-suited for large-scale systems. The resulting closed-loop laws are explicit and robust to both modeling details and information structure, enabling distributed implementation (Yong, 2011, Arabneydi et al., 2016). The theory also provides transparent sufficient conditions for existence, uniqueness, and stabilizability of the problem. Extensions to random coefficients, backward SDEs, and entropy-regularized reinforcement learning formulations further broaden the framework’s applicability (Xiong et al., 7 Jun 2024, Xiong et al., 1 Mar 2025, Frikha et al., 2023).

The linear-quadratic mean-field control framework thus forms a foundational methodology for scalable control of high-dimensional stochastic systems with nontrivial collective behavior, offering both technical depth and practical tractability.
