
Density-Driven Optimal Control (D2OC)

Updated 23 November 2025
  • Density-Driven Optimal Control is a framework that lifts nonlinear, stochastic, and high-dimensional system dynamics into the space of probability densities using operator theory.
  • It reformulates optimal control as a convex density program with linear PDE constraints and quadratic-over-linear cost functions to ensure global optimality.
  • The method integrates finite-dimensional approximations and data-driven operator learning, enabling efficient and safe control synthesis across diverse applications.

Density-Driven Optimal Control (D2OC) is a class of optimal control methodologies that synthesizes policies by lifting nonlinear, stochastic, or mean-field system dynamics into the space of probability densities, rather than optimizing directly over trajectories or controls. By leveraging linear operator theory, chiefly the Perron-Frobenius (P-F) and Koopman operators, D2OC provides a principled, convex, and often data-driven framework for controlled density evolution. It accommodates both pathwise and stationary objectives, safety constraints, and high-dimensional control synthesis, with algorithmic applicability to deterministic, stochastic, and hybrid dynamical systems.

1. Mathematical Foundations: System and Density Evolution

Density-Driven Optimal Control is fundamentally grounded in the evolution of probability densities governed by controlled dynamical systems. For a controlled diffusion process

$$dx_t = f(x_t)\,dt + G(x_t)\,u_t\,dt + \sigma(x_t)\,dW_t$$

the state density $\rho(x,t)$ evolves according to the Fokker-Planck (forward Kolmogorov) partial differential equation (PDE)

$$\partial_t \rho + \nabla\cdot\big(f_c\,\rho\big) - \frac{1}{2}\sum_{i,j} \partial^2_{x_i x_j}\Big((\sigma\sigma^\top)_{ij}(x)\,\rho\Big) = 0,$$

where $f_c(x) = f(x) + G(x)k(x)$ for a feedback law $u = k(x)$. The infinitesimal generator $\mathcal{A}_{PF}^{f_c}$ expresses this linear evolution on densities,

$$\partial_t \rho = \mathcal{A}_{PF}^{f_c}[\rho],$$

while the dual Koopman generator $\mathcal{A}_{K}^{f_c}$ acts on observables $\phi(x)$ as

$$\partial_t \phi = \mathcal{A}_{K}^{f_c}[\phi] = f_c\cdot\nabla\phi + \frac{1}{2}\operatorname{tr}\big[\sigma\sigma^\top\nabla^2\phi\big].$$

This operator-theoretic structure supports both the forward (density) and backward (value function) perspectives crucial to D2OC (Vaidya et al., 2022).
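The forward (density) picture can be checked numerically on a simple example: simulate the SDE with Euler-Maruyama for a particle ensemble and compare the empirical density with the known Gaussian solution of the corresponding Fokker-Planck equation. The sketch below uses a 1D Ornstein-Uhlenbeck process ($f(x) = -x$, constant $\sigma$, no control); the specific system, parameters, and function names are illustrative, not from the cited papers.

```python
import numpy as np

def simulate_ou(n_particles=50_000, x0=2.0, sigma=0.5, T=1.0, dt=1e-3, seed=0):
    """Euler-Maruyama simulation of dx = -x dt + sigma dW for an ensemble."""
    rng = np.random.default_rng(seed)
    x = np.full(n_particles, x0)
    for _ in range(int(T / dt)):
        x += -x * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_particles)
    return x

# From a point mass at x0, the Fokker-Planck solution for the OU process is
# Gaussian with mean x0*exp(-T) and variance sigma^2/2 * (1 - exp(-2T)).
T, sigma, x0 = 1.0, 0.5, 2.0
x = simulate_ou(x0=x0, sigma=sigma, T=T)
mean_th = x0 * np.exp(-T)
var_th = sigma**2 / 2 * (1 - np.exp(-2 * T))
print(x.mean(), mean_th)   # empirical vs analytic mean
print(x.var(), var_th)     # empirical vs analytic variance
```

The ensemble statistics track the analytic Fokker-Planck moments, illustrating that the particle (trajectory) and density (PDE) descriptions agree.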

2. Convex Density Program: Infinite-Dimensional Formulation

D2OC reformulates the original stochastic optimal control problem (SOCP) as a convex optimization in the space of densities. For the infinite-horizon running cost $\ell(x,u) = q(x) + r|u|^2$ and initial density $h_0$, the expected cost functional is

$$J = \int_X \big(q(x) + r|k(x)|^2\big)\,\rho(x)\,dx, \qquad \rho(x) = \int_0^\infty [\mathbb{P}_t^c h_0](x)\,dt.$$

Defining the flux variable $m(x) = \rho(x)\,k(x)$, the stationary P-F (or Liouville) equation is

$$\mathcal{A}_{PF}^{f}[\rho] + \nabla\cdot\big(G(x)\,m(x)\big) = -h_0(x).$$

The convex program becomes

$$\min_{\rho\geq 0,\, m}\ \int \Big(q(x)\,\rho(x) + r\,\frac{|m(x)|^2}{\rho(x)}\Big)\,dx \quad \text{subject to}\quad \mathcal{A}_{PF}^{f}[\rho] + \nabla\cdot(Gm) = -h_0.$$

This quadratic-over-linear control cost is jointly convex in $(\rho, m)$ for $\rho > 0$, and the PDE constraint is linear (Vaidya et al., 2022, Moyalan et al., 2022, Huang et al., 2020).
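The joint convexity of the quadratic-over-linear term $r|m|^2/\rho$ is what makes the lifted program tractable, and it can be spot-checked numerically via the midpoint inequality $g\big(\tfrac{z_1+z_2}{2}\big) \leq \tfrac{1}{2}\big(g(z_1)+g(z_2)\big)$ for $z = (\rho, m)$ with $\rho > 0$. A small illustrative sketch:

```python
import numpy as np

def g(rho, m, r=1.0):
    """Quadratic-over-linear control cost density r*|m|^2 / rho (rho > 0)."""
    return r * m**2 / rho

rng = np.random.default_rng(1)
violations = 0
for _ in range(10_000):
    rho1, rho2 = rng.uniform(0.1, 10.0, size=2)   # positive densities
    m1, m2 = rng.uniform(-10.0, 10.0, size=2)     # unconstrained fluxes
    mid = g((rho1 + rho2) / 2, (m1 + m2) / 2)
    avg = (g(rho1, m1) + g(rho2, m2)) / 2
    if mid > avg + 1e-9:                           # floating-point slack
        violations += 1
print(violations)  # 0: midpoint convexity holds at every sampled pair
```

In optimization terms, $|m|^2/\rho$ is the perspective of the convex function $|m|^2$, which is why the substitution $m = \rho k$ converts the bilinear trajectory-space problem into a jointly convex density-space one.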

3. Data-Driven and Finite-Dimensional Approximation

Finite-dimensional approximation of the above infinite-dimensional program is achieved by projecting densities and fluxes onto a dictionary of nonnegative basis functions (e.g., Gaussian radial basis functions or polynomials):

$$\rho(x) \approx \Psi(x)^\top v, \qquad m(x) \approx \Psi(x)^\top w, \qquad h_0(x) \approx \Psi(x)^\top h.$$

The infinitesimal P-F and Koopman operators are identified from data by methods such as extended dynamic mode decomposition (EDMD) or naturally structured DMD (NSDMD). Operator learning exploits time-series data from uncontrolled and controlled system simulations:

$$P \approx \exp\big(\Delta t\, \mathcal{A}_{PF}^f\big), \qquad \mathcal{A}_{PF}^f \approx (P - I)/\Delta t.$$

The discrete convex quadratic program then reads

$$\min_{v\geq 0,\, w}\ d^\top v + r\, w^\top D w \quad \text{subject to}\quad -(A_f v + A_G w) = h,$$

with $D = \int \Psi\Psi^\top dx$ and $d = \int q(x)\Psi(x)\,dx$. This is solvable by standard convex solvers (e.g., CVX), with the feedback law recovered as $k^*(x) = \dfrac{\Psi(x)^\top w^*}{\Psi(x)^\top v^*}$ (Vaidya et al., 2022, Moyalan et al., 2022, Huang et al., 2020).
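As a toy illustration of operator identification, EDMD on the 1D linear system $\dot x = -x$ with a monomial dictionary recovers the Koopman generator essentially exactly: on the basis $\{x, x^2, x^3\}$ the generator acts as $\mathcal{A}_K[x^k] = -k\,x^k$, so its matrix is $\mathrm{diag}(-1,-2,-3)$. A minimal sketch (the system, dictionary, and sampling are illustrative choices, not taken from the cited papers):

```python
import numpy as np

dt = 0.01
X = np.linspace(-2.0, 2.0, 200)          # sampled states
Y = X * np.exp(-dt)                      # exact flow of dx/dt = -x over dt

def dictionary(x):
    """Monomial dictionary Psi(x) = [x, x^2, x^3]."""
    return np.column_stack([x, x**2, x**3])

Psi_X, Psi_Y = dictionary(X), dictionary(Y)
# EDMD: least-squares fit of K in Psi(Y) ~ Psi(X) @ K
K, *_ = np.linalg.lstsq(Psi_X, Psi_Y, rcond=None)
A_K = (K - np.eye(3)) / dt               # generator approximation (K - I)/dt
print(np.round(A_K, 3))                  # approximately diag(-1, -2, -3)
```

The same least-squares machinery applied to density-weighted snapshots (with nonnegativity structure, as in NSDMD) yields the P-F generator approximation $A_f$ used in the quadratic program above.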

4. Duality: Koopman-HJB Formulation and Policy Iteration

The density-driven convex program is dual to a value-function approach posed in the space of observables. The corresponding Hamilton-Jacobi-Bellman (HJB) PDE, with generator $\mathcal{A}_K^{f+Gu}$, is

$$\min_{u}\Big[\mathcal{A}_K^{f+Gu}V(x) + q(x) + r|u|^2\Big] = 0.$$

At the optimum, $u^* = -\frac{1}{2r}G^\top\nabla V$, giving a closed-loop generator $\mathcal{A}_K^{f+Gk}$. Policy iteration proceeds by alternately solving:

  1. Policy evaluation: solve a linear PDE for $V_k$ under the fixed policy $k_k$
  2. Policy improvement: update $k_{k+1}(x) = -\frac{1}{2r}G(x)^\top \nabla V_k(x)$
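For linear dynamics $\dot x = Ax + Bu$ with quadratic cost $x^\top Qx + u^\top Ru$, the two steps above reduce to Kleinman's iteration: policy evaluation becomes a Lyapunov equation and policy improvement a gain update, converging to the Riccati solution. A sketch under these assumptions (the matrices and initial stabilizing gain are illustrative):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, 0.5]])   # illustrative unstable dynamics
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

K = np.array([[0.0, 3.0]])                # initial stabilizing gain
for _ in range(20):
    Ac = A - B @ K                        # closed-loop dynamics under policy K
    # Policy evaluation: Ac^T P + P Ac + Q + K^T R K = 0 (linear in P)
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
    # Policy improvement: K <- R^{-1} B^T P  (cf. u* = -(1/2r) G^T grad V)
    K = np.linalg.solve(R, B.T @ P)

P_are = solve_continuous_are(A, B, Q, R)  # ground truth from the Riccati equation
print(np.allclose(P, P_are, atol=1e-6))
```

With $V(x) = x^\top P x$, the improvement step is exactly $u^* = -\frac{1}{2r}G^\top \nabla V$ specialized to the linear-quadratic case, and each evaluation step is a linear solve, mirroring the linear-PDE structure of the general scheme.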

Koopman and P-F operators are adjoint, and under technical conditions, the optimal flux $m^* = \rho^* k^*$ and value function $V$ are related through convex (Lagrangian) duality (Vaidya et al., 2022).

5. Extensions: Constraints, Safety, and Dual Density-Driven Structures

The density-driven approach natively accommodates state and input constraints via linear or convex restrictions in density space:

  • Hard state constraints: $\int_{X_u}\rho(x)\,dx = 0$ for obstacle avoidance
  • Traversability or safety budgets: $\int B(x)\,\rho(x)\,dx \leq \gamma$

Maximum-entropy variants add differential entropy regularization, producing Gaussian control policies and connecting D2OC to Schrödinger Bridges—entropy-optimal interpolating processes between marginals (Ito et al., 2022). Extensions to hybrid jump-diffusions, mean-field limits, and PDE-constrained swarm control rely on generalized Chapman-Kolmogorov or Fokker-Planck type equations, with first-order optimality conditions derived via infinite-dimensional Pontryagin or minimum principle frameworks (Bakshi et al., 2020, Sinigaglia et al., 2021).
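The Schrödinger-bridge connection can be made concrete in the static, discrete setting: entropy-regularized optimal transport between two marginals is solved by Sinkhorn's alternating scaling, a discrete analogue of the bridge's half-iterations. A minimal sketch on a 1D grid (the grid, marginals, and regularization $\varepsilon$ are illustrative):

```python
import numpy as np

n, eps = 50, 0.05
x = np.linspace(-1.0, 1.0, n)
mu = np.exp(-(x + 0.5)**2 / 0.02); mu /= mu.sum()   # initial marginal
nu = np.exp(-(x - 0.5)**2 / 0.02); nu /= nu.sum()   # target marginal

C = (x[:, None] - x[None, :])**2                    # quadratic transport cost
Kmat = np.exp(-C / eps)                             # Gibbs kernel

u = np.ones(n)
for _ in range(500):                                # Sinkhorn scaling iterations
    v = nu / (Kmat.T @ u)
    u = mu / (Kmat @ v)

Pi = u[:, None] * Kmat * v[None, :]                 # entropic transport plan
print(np.abs(Pi.sum(axis=1) - mu).max())            # row marginal error
print(np.abs(Pi.sum(axis=0) - nu).max())            # column marginal error
```

The converged plan matches both prescribed marginals while maximizing entropy relative to the Gibbs kernel, the same structure that maximum-entropy density control recovers in the dynamic setting.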

6. Convergence, Global Optimality, and Practical Algorithms

Convexity of the lifted density-control cost ensures global optimality of the computed control law within the chosen function space. As the number of basis functions increases and operator approximations improve with data, the solution converges to the infinite-dimensional optimum (Vaidya et al., 2022, Moyalan et al., 2022). Standard convex programming complexity applies, dominated by the size of the quadratic program. For high-dimensional or nonlinear systems, neural-network parameterizations and automatic differentiation enable particle-based saddle-point solvers that bypass state-space gridding (Ma et al., 2023).
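The finite-dimensional program of Section 3 is small enough to hand to a generic solver. A sketch using scipy's SLSQP on a tiny synthetic instance, minimizing $d^\top v + r\,w^\top D w$ subject to $-(A_f v + A_G w) = h$ and $v \geq 0$; the operator matrices here are random placeholders rather than learned P-F approximations:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 4
A_f = rng.standard_normal((n, n))     # placeholder for the learned P-F generator
A_G = rng.standard_normal((n, n))     # placeholder for the control-channel operator
D = np.eye(n)                         # Gram matrix of the basis (placeholder)
d = rng.uniform(0.5, 1.5, n)          # projected state cost
r = 1.0

# Build a feasible right-hand side so the equality constraint is satisfiable.
v0, w0 = np.ones(n), np.zeros(n)
h = -(A_f @ v0 + A_G @ w0)

def cost(z):
    v, w = z[:n], z[n:]
    return d @ v + r * (w @ D @ w)

cons = {"type": "eq", "fun": lambda z: -(A_f @ z[:n] + A_G @ z[n:]) - h}
bounds = [(0.0, None)] * n + [(None, None)] * n   # v >= 0, w free
res = minimize(cost, np.concatenate([v0, w0]), bounds=bounds,
               constraints=[cons], method="SLSQP")
v_opt, w_opt = res.x[:n], res.x[n:]
print(res.success, res.fun)
```

In the full pipeline, $v^*$ and $w^*$ would then be substituted into $k^*(x) = \Psi(x)^\top w^* / \Psi(x)^\top v^*$ to recover the feedback law.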

7. Applications and Numerical Demonstrations

D2OC frameworks have been validated on a range of systems, including data-driven stochastic optimal control benchmarks, off-road vehicle navigation, safe controller synthesis via density functions, large-scale particle-swarm density control, and high-dimensional density control with Wasserstein metric matching.

Each application exploits the convex density-driven lifting to encode distributional performance, uncertainty, and hard constraints while scaling to high dimensions, yielding significant advantages over purely trajectory-based or value-function-based approaches.


References

  • "Data-Driven Stochastic Optimal Control using Linear Transfer Operators" (Vaidya et al., 2022)
  • "Maximum entropy optimal density control of discrete-time linear systems and Schrödinger bridges" (Ito et al., 2022)
  • "Density control of large-scale particles swarm through PDE-constrained optimization" (Sinigaglia et al., 2021)
  • "Data-Driven Convex Approach to Off-road Navigation via Linear Transfer Operators" (Moyalan et al., 2022)
  • "Data-Driven Optimal Control via Linear Transfer Operators: A Convex Approach" (Moyalan et al., 2022)
  • "A Convex Approach to Data-driven Optimal Control via Perron-Frobenius and Koopman Operators" (Huang et al., 2020)
  • "Open-loop Deterministic Density Control of Marked Jump Diffusions" (Bakshi et al., 2020)
  • "Optimal Safe Controller Synthesis: A Density Function Approach" (Chen et al., 2019)
  • "High-dimensional Optimal Density Control with Wasserstein Metric Matching" (Ma et al., 2023)
