Moment Constrained Optimal Transport

Updated 28 August 2025

Moment Constrained Optimal Transport is a framework that integrates global moment conditions (e.g., expected values, higher-order moments) into optimal transport problems.
It employs variational principles and dual formulations to achieve unique, log-concave solutions and robust convergence via displacement convexity.
Numerical methods leveraging sparsity and entropic regularization make MCOT practical for applications in physics, control, and machine learning.

Moment Constrained Optimal Transport (MCOT) is a class of optimal transport (OT) problems in which the transported measures or plans must satisfy prescribed moment constraints. These constraints impose global conditions, such as matching expected values or higher-order moments, on the transported distributions beyond the classical marginal constraints. MCOT has emerged as a flexible modeling tool across probability, analysis, statistical inference, machine learning, distributed control, and numerical optimization, enabling the integration of complex global requirements into transport-based formulations.

1. Variational Foundations and Moment Measure Characterization

At the core of MCOT is the variational approach combining classical OT cost functionals with additional constraints that enforce the matching of moments or more general feature expectations. In "Dealing with moment measures via entropy and optimal transport" (Santambrogio, 2015), the fundamental connection is made by considering the minimization over absolutely continuous probability densities $p$:

$\min_{p \in \mathcal{P}_1(\mathbb{R}^d)} \left\{ \mathcal{E}(p) + \mathcal{T}(p, \mu) \right\},$

where $\mathcal{E}(p) = \int p(x) \ln p(x) dx$ is the entropy and $\mathcal{T}(p, \mu)$ is a transport cost, defined as the maximal correlation between $p$ and a measure $\mu$ linked to the target moment structure. Moment constraints (e.g., zero barycenter, non-concentration on hyperplanes) are imposed by restricting the search space, ensuring that minimizers correspond to measures of the form $p(x) = c \exp(-u(x))$ with $u$ convex and essentially continuous.

The optimizer in this setup can be interpreted as producing a log-concave density whose associated push-forward by the gradient of the potential $u$ matches the target measure's moment structure. Displacement convexity of both entropy and transport cost is essential, ensuring strict convexity along Wasserstein geodesics and hence uniqueness of the minimizer, up to translation. Semicontinuity results guarantee that the variational problem is well-posed and minimizers exist under mild integrability conditions.
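The push-forward relation described above can be checked by hand in a simple case. The following is a sketch under the extra assumption that $u$ is smooth and $e^{-u}$ decays fast enough to integrate by parts; it shows why a zero barycenter is a necessary condition on $\mu$, and it works out the Gaussian example.

```latex
% Moment measure of a convex potential u with 0 < \int e^{-u} < \infty:
\mu = (\nabla u)_{\#}\, p, \qquad
p(x) = \frac{e^{-u(x)}}{\int_{\mathbb{R}^d} e^{-u(y)}\,dy}.

% Barycenter of \mu (boundary terms vanish by the assumed decay of e^{-u}):
\int y \, d\mu(y) = \int \nabla u(x)\, p(x)\, dx
  = -\frac{1}{\int e^{-u}} \int \nabla\!\left(e^{-u(x)}\right) dx = 0.

% Example: u(x) = |x|^2/2 gives the standard Gaussian density p and \nabla u(x) = x,
% so \mu = (\mathrm{id})_{\#}\, p is the standard Gaussian measure itself.
```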

This variational principle both generalizes and unifies earlier moment-measure problems (e.g., those studied by Cordero-Erausquin and Klartag) and sets the theoretical foundation for MCOT in broader contexts.

2. Abstract Duality and Constraint Integration

A general abstract formulation of MCOT is provided in "Constrained Optimal Transport" (Ekren et al., 2016), which embeds both OT and moment constraints into an order-theoretic and convex-analytic framework. The problem is cast in a Banach lattice $\mathcal{X}$ with an order unit $e$, defining a natural norm and order structure:

$\|f\|_{\mathcal{X}} = \inf \{ c \in \mathbb{R}: -c e \leq f \leq c e \}.$

The primal problem becomes the maximization of $n(a)$ over convex subsets of the positive unit sphere in the dual $X''$, representing constraint-satisfying measures. Moment constraints enter by requiring that $n(f_i)$ take given values for prescribed test functions $\{f_i\}$. The dual problem is formulated on $X''$ with elements of the form $c e + h$, encompassing additional "hedging" directions induced by the constraints.

This framework yields a robust duality theory: moment constraints enlarge the set of dual elements by additional functionals that "annihilate" the constraint space, and the primal–dual equality extends. Regular convexity, separation, and the Krein–Šmulian property provide general conditions for dual attainment in the presence of constraint sets. In applications such as martingale OT or financial hedging, this abstract analysis ensures that MCOT problems admit tractable dual formulations and strong existence/uniqueness results.
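To make the role of the hedging directions concrete, the weak-duality half of the argument can be written out for finitely many moment constraints. This is an illustrative specialization rather than the statement from the cited paper: the values $r_i$, the multipliers $\lambda_i$, and the explicit form of $h$ are notation assumed here.

```latex
% Primal, with r_i the prescribed values of the moment functionals (assumed notation):
P = \sup \{\, n(a) : n \ge 0,\ n(e) = 1,\ n(f_i) = r_i,\ i = 1,\dots,N \,\}.

% Dual, with hedging directions h = \sum_i \lambda_i (f_i - r_i e):
D = \inf \Big\{\, c \in \mathbb{R} : c\,e + \sum_{i=1}^N \lambda_i (f_i - r_i e) \ge a
      \ \text{for some } \lambda \in \mathbb{R}^N \Big\}.

% Weak duality: for any primal-feasible n and dual-feasible (c, \lambda),
n(a) \le n\Big(c\,e + \sum_i \lambda_i (f_i - r_i e)\Big)
     = c + \sum_i \lambda_i \big(n(f_i) - r_i\big) = c,
\qquad\text{hence } P \le D.
```

Every feasible $n$ vanishes on $h$, which is the sense in which the added dual elements annihilate the constraint space. The reverse inequality is where the conditions named above are needed.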

3. Numerical Methods and Discrete Representation

Numerical methods for MCOT leverage the finite-dimensional nature of moment constraints and the resulting structural sparsity. In the multi-marginal setting, as explored in "Approximation of Optimal Transport problems with marginal moments constraints" (Alfonsi et al., 2019) and "Constrained overdamped Langevin dynamics for symmetric multimarginal optimal transportation" (Alfonsi et al., 2021), the classic marginal constraints are replaced by finite sets of moment constraints, using Tchakaloff's theorem to guarantee that optimal plans are supported on a sparse set of points.

Mathematically, for $N$ constraints and $M$ marginals, the MCOT problem can be discretized as

$\min_{\pi} \int c(x_1, \ldots, x_M) d\pi(x_1, \ldots, x_M)$

subject to

$\sum_{k=1}^K w_k \varphi_n(X^k_m) = \mu_n \quad \text{for } n = 1, \ldots, N, \ \forall m,$

where $\pi = \sum_{k=1}^K w_k \delta_{(X^k_1, \ldots, X^k_M)}$ is now a finitely supported measure, i.e., a weighted sum of $K$ Dirac masses with weights $w_k$. Optimization uses stochastic (overdamped) Langevin dynamics projected onto the constraint set via Newton's method, ensuring all moments are met at each iteration.

This representation confers two advantages:

The number of support points scales linearly with the number of moment constraints, not exponentially with the system size.
All local minimizers are global, due to the underlying sparsity.

Sparsity allows for bypassing the curse of dimensionality, making MCOT practical for electronic structure problems in density functional theory (DFT) and other high-dimensional physical models.
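The projection step used in this section, which returns particles to the set where the moment constraints hold, can be sketched in a toy setting. The example below assumes one-dimensional particles with equal weights and two constraints (prescribed mean and second moment), and uses minimum-norm Newton steps. The function names and the setup are hypothetical and are not taken from the cited papers, whose scheme also includes the Langevin noise and the cost gradient.

```python
def moments(xs):
    """Empirical first and second moments of equally weighted particles."""
    k = len(xs)
    return sum(xs) / k, sum(x * x for x in xs) / k


def project_onto_moments(xs, m1, m2, tol=1e-12, max_iter=50):
    """Move particles onto {mean = m1, second moment = m2} by minimum-norm Newton steps.

    Assumes m2 > m1**2 and that the particles are not all identical.
    """
    xs = list(xs)
    k = len(xs)
    for _ in range(max_iter):
        a, b = moments(xs)
        g1, g2 = a - m1, b - m2  # constraint residuals
        if abs(g1) < tol and abs(g2) < tol:
            break
        # Jacobian rows: dg1/dx_j = 1/k and dg2/dx_j = 2*x_j/k.
        # Gram matrix G = J J^T is 2x2 and is inverted in closed form.
        g11 = 1.0 / k
        g12 = 2.0 * a / k
        g22 = 4.0 * b / k
        det = g11 * g22 - g12 * g12
        l1 = (g22 * g1 - g12 * g2) / det
        l2 = (g11 * g2 - g12 * g1) / det
        # Minimum-norm update x <- x - J^T lambda.
        xs = [x - (l1 + 2.0 * x * l2) / k for x in xs]
    return xs


# Usage: ten particles on [0, 0.9] are moved to have mean 0 and second moment 1.
start = [0.1 * j for j in range(10)]
projected = project_onto_moments(start, 0.0, 1.0)
mean_after, second_after = moments(projected)
```

In a constrained Langevin scheme, a step of this kind would follow each unconstrained noisy gradient move, so the iterates stay on the constraint set.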

4. Regularized and Algorithmic Approaches

Regularization and algorithmic strategies play a critical role in scalable MCOT. Entropic regularization, as introduced in "Moment Constrained Optimal Transport for Control Applications" (Corre et al., 2022), smooths the optimization landscape by penalizing deviation from independence via a relative entropy term:

$\langle \gamma, c \rangle + \varepsilon D(\gamma\| \mu_1 \otimes \mu_2),$

with $\gamma$ constrained to couplings between the fixed marginal $\mu_1$ and a "moment class" of candidate laws $\mu_2$:

$\mathcal{M}(f, r) = \left\{ \mu : \langle \mu, f_i \rangle = r_i, \ i=1,\ldots, M \right\}.$

Dual variables corresponding to the moment constraints enter as Lagrange multipliers in the dual functional, substituting part of the classic Kantorovich potentials. The resulting dual is lower-dimensional and efficiently optimizable, e.g., via Sinkhorn-adapted iterative scaling and stochastic approximation.
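The lower-dimensional dual can be illustrated with a small discrete sketch. The setup here is assumed rather than taken from the cited paper: a quadratic cost, a fixed reference measure `mu1 (x) nu0` in the entropy term, a single scalar moment constraint, and plain gradient ascent on its multiplier `lam`. The potential for the fixed marginal is eliminated in closed form by normalizing each row of the Gibbs plan, so only `lam` is iterated.

```python
import math


def gibbs_plan(xs, mu1, ys, nu0, fy, lam, eps):
    """Gibbs plan for multiplier lam; each row is normalized so the first marginal is mu1."""
    gamma, moment = [], 0.0
    for x, m in zip(xs, mu1):
        expo = [(lam * fj - (x - y) ** 2) / eps for y, fj in zip(ys, fy)]
        top = max(expo)  # shift exponents for numerical stability
        w = [n * math.exp(e - top) for n, e in zip(nu0, expo)]
        z = sum(w)
        row = [m * wj / z for wj in w]
        moment += sum(g * fj for g, fj in zip(row, fy))
        gamma.append(row)
    return gamma, moment


def entropic_mcot(xs, mu1, ys, nu0, f, r, eps=0.1, step=0.2, iters=2000):
    """Gradient ascent on the concave dual of an entropic, moment-constrained OT problem.

    Assumed toy setup: cost (x - y)^2, reference measure mu1 (x) nu0, and one
    constraint sum_ij gamma_ij * f(y_j) = r. The dual gradient in lam is r minus
    the current moment of the plan.
    """
    fy = [f(y) for y in ys]
    lam = 0.0
    for _ in range(iters):
        _, moment = gibbs_plan(xs, mu1, ys, nu0, fy, lam, eps)
        lam += step * (r - moment)
    gamma, moment = gibbs_plan(xs, mu1, ys, nu0, fy, lam, eps)
    return gamma, lam, moment


# Usage: uniform mu1 on five points, and a target mean of 0.7 for the second marginal.
xs = [i / 4 for i in range(5)]
mu1 = [0.2] * 5
ys = [j / 10 for j in range(11)]
nu0 = [1 / 11] * 11
gamma, lam, moment = entropic_mcot(xs, mu1, ys, nu0, lambda y: y, 0.7)
```

The step size is chosen below `eps` divided by the largest possible variance of `f` on the grid, which keeps the ascent stable in this example. With several constraints, `lam` becomes a vector and the same update applies componentwise.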

Dedicated proximal splitting and operator-based methods have been developed for constrained dynamic (time-dependent) MCOT problems, as in "Constrained Mass Optimal Transport" (Kerrache et al., 2022) and "Fundamental diagram constrained dynamic optimal transport via proximal splitting methods" (Dong et al., 28 Jul 2025). Augmented Lagrangian and primal-dual methods enable enforcement of linear and nonlinear moment constraints, including those induced by physical system dynamics (e.g., conservation laws, traffic congestion). These algorithms are rigorously analyzed for convergence, including in high-dimensional or nonconvex settings.
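As a minimal illustration of how an augmented Lagrangian enforces a linear moment constraint, the sketch below applies the method of multipliers to a toy quadratic objective. The objective, the function name, and the parameters are assumptions made for illustration, and the example is far simpler than the dynamic problems treated in the cited works.

```python
def method_of_multipliers(a, f, r, rho=1.0, iters=100):
    """Minimize 0.5 * ||w - a||^2 subject to the linear moment constraint sum(f * w) = r.

    Each pass minimizes the augmented Lagrangian exactly in w and then updates the
    multiplier. For this quadratic objective the residual of the w-minimizer is
    available in closed form, so w itself is only formed at the end.
    """
    ff = sum(v * v for v in f)
    fa = sum(u * v for u, v in zip(a, f))
    lam = 0.0
    for _ in range(iters):
        # Residual s = f.w - r at the exact minimizer w = a - (lam + rho * s) * f.
        s = (fa - r - lam * ff) / (1.0 + rho * ff)
        lam += rho * s  # multiplier (dual) update
    w = [u - lam * v for u, v in zip(a, f)]
    return w, lam


# Usage: adjust uniform weights so that the first moment against f equals 2.0.
f = [0.0, 1.0, 2.0, 3.0]
w, lam = method_of_multipliers([0.25] * 4, f, 2.0)
constraint_value = sum(fi * wi for fi, wi in zip(f, w))
```

In this toy case the multiplier error shrinks by a factor of `1 / (1 + rho * ||f||^2)` per pass. Only the moment constraint is enforced, so the resulting weights are not renormalized to sum to one.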

5. Applications: Physics, Control, and Data Sciences

MCOT has broad applicability across applied and theoretical domains:

Density Functional Theory and Strongly Correlated Electrons: The MCOT approximation enables physically consistent and computationally efficient solutions to multi-marginal Coulomb cost problems in electronic structure (Alfonsi et al., 2021). By constraining moments reflecting physical observables (e.g., charge, center-of-mass), MCOT captures essential quantum phenomena without resorting to grid-based discretization.
Distributed Control of Thermostatically Controlled Loads (TCLs): In "Moment Constrained Optimal Transport for Thermostatically Controlled Loads" (Corre et al., 27 Aug 2025), MCOT is used for large-scale demand response in electric grid management. Moment constraints encode aggregate power consumption and ramping requirements, while physical trajectories are governed by ODEs embedded in feasibility constraints. The framework facilitates tractable Monte Carlo gradient computations and supports online model predictive control.
Traffic Flow and Congestion Modeling: Imposing nonlinear, pointwise moment constraints based on the fundamental diagram of traffic theory ensures that the flow adheres to realistic capacity and congestion bounds, as in Dong et al. (28 Jul 2025). MCOT yields congestion-aware transport plans exhibiting spreading, rerouting, and realistic transient behavior.
Fairness in Machine Learning and Partial Identification: In statistical models with missing or incomplete data, MCOT offers a fully characterized (via OT cost inequalities) parameter identification set, as shown in "Partial Identification in Moment Models with Incomplete Data via Optimal Transport" (Fan et al., 20 Mar 2025). In algorithmic fairness, MCOT-based support-function calculations yield tight bounds on disparate impact and true positive rate disparities under moment-linked constraints.
Signal Processing and Compression: Within the framework of "Constrained Gaussian Wasserstein Optimal Transport with Commutative Covariance Matrices" (Chen et al., 5 Mar 2025), MCOT enables explicit reverse waterfilling strategies for resource-constrained transformation between distributions, with applications in lossy compression, dimensionality reduction, and channel-constrained coding.
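For the Gaussian setting in the last item, commuting covariance matrices matter because the classical 2-Wasserstein distance between Gaussians then decouples along the shared eigenvectors. The sketch below evaluates that standard closed form. It is not the constrained problem or the reverse waterfilling solution from the cited paper, and the function name is hypothetical.

```python
import math


def gaussian_w2_squared(mean1, eig1, mean2, eig2):
    """Squared 2-Wasserstein distance between Gaussians with commuting covariances.

    mean1, mean2: mean vectors, written in a common orthonormal basis.
    eig1, eig2:   covariance eigenvalues, listed in the same eigenvector order.
    """
    shift = sum((p - q) ** 2 for p, q in zip(mean1, mean2))
    spread = sum((math.sqrt(s) - math.sqrt(t)) ** 2 for s, t in zip(eig1, eig2))
    return shift + spread


# Usage: unit mean shift in the first coordinate, standard deviations (2, 1) versus (1, 1).
value = gaussian_w2_squared([0.0, 0.0], [4.0, 1.0], [1.0, 0.0], [1.0, 1.0])
```

Because the distance splits into one term per eigen-direction, resource constraints on the transformation can be allocated direction by direction, which is the structure that waterfilling-type solutions exploit.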

6. Computational Trade-offs and Theoretical Guarantees

MCOT exhibits a favorable computational–statistical trade-off by leveraging the structure induced by moment constraints:

The restriction to a finite set of constraints, as opposed to the full set of marginals, drastically reduces the effective complexity, enabling the use of sparse discrete representations.
Entropic and KL regularization terms not only ensure computational tractability but also guarantee solution uniqueness, smoothness of the optimization landscape, and stability under noisy data and model uncertainty.
Moment constraints can be integrated in abstract dual frameworks, extending strong duality and Fenchel–Moreau convex analysis to new MCOT variants (Ekren et al., 2016).
Semicontinuity and displacement convexity underpin existence and (up to symmetry) uniqueness.

For settings where only moment information is available, recent techniques based on the sums-of-squares (SoS) hierarchy allow the OT problem to be reformulated at the level of moments and solved via semidefinite programming, with provable convergence as the relaxation order increases (Mula et al., 2022). This opens MCOT to high-dimensional applications where only moment sequences, rather than explicit measures, are available.
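The lowest level of such a hierarchy can be worked out by hand in one dimension with quadratic cost. The derivation below is an illustrative sketch, not the construction of the cited paper: $s$ denotes the unknown mixed moment of the plan, and $m_1$, $m_2$, $\sigma^2$ denote the mean, second moment, and variance of each marginal (all notation assumed here).

```latex
% Quadratic cost in 1D, written through moments of the plan \pi:
\int (x - y)^2 \, d\pi = m_2(\mu) + m_2(\nu) - 2 s, \qquad s = \int x y \, d\pi.

% Order-1 relaxation: s only has to keep the moment matrix positive semidefinite,
\begin{pmatrix}
1 & m_1(\mu) & m_1(\nu) \\
m_1(\mu) & m_2(\mu) & s \\
m_1(\nu) & s & m_2(\nu)
\end{pmatrix} \succeq 0
\iff \big(s - m_1(\mu)\, m_1(\nu)\big)^2 \le \sigma_\mu^2\, \sigma_\nu^2 .

% Taking the largest admissible s gives the lower bound
W_2^2(\mu, \nu) \ \ge\ \big(m_1(\mu) - m_1(\nu)\big)^2 + (\sigma_\mu - \sigma_\nu)^2 ,
% which is attained, for example, when \mu and \nu are both Gaussian.
```

Higher relaxation orders add higher moments of the plan and larger moment matrices, tightening the bound.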

7. Outlook and Open Problems

Active research directions in MCOT include:

Extending the framework to handle dual-sided moment constraints and mixed marginal–moment constraints (Corre et al., 2022).
Developing scalable algorithms for large-scale systems and streaming data, with parallelized operator splitting, adaptive importance sampling, and sample-efficient gradient estimation (Corre et al., 27 Aug 2025).
Integrating MCOT with robust and adversarial OT paradigms, as well as with statistical inference via empirical Bayes and partial identification (Jaffe et al., 11 Jun 2025, Fan et al., 20 Mar 2025).
Theoretical analysis of relaxation gaps, rates of convergence for finite-moment approximations, and the geometry of the feasible set under non-polynomial or nonlinear moment constraints.
Applications in emerging domains such as energy markets, biology, high-dimensional control, and machine learning under distributional shifts.

In summary, MCOT unifies and extends the classical optimal transport framework to accommodate a diverse range of scientific and engineering constraints by encoding global properties into transport problems through moment conditions. This class of problems not only extends the theoretical boundaries of optimal transport but also provides robust, scalable methodologies for practical applications across computation, control, and inference.