Optimal Transport Frameworks

Updated 19 June 2026

Optimal transport frameworks are rigorous mathematical and computational systems designed for mapping one probability distribution to another with minimal cost, crucial in fields like machine learning and economics.
They incorporate classical and generalized formulations, leveraging techniques such as entropic regularization, Sinkhorn algorithms, and multiscale methods to improve efficiency and scalability.
Recent advances focus on dynamic, constrained, and robust variants that offer strong theoretical guarantees and enable practical applications in imaging, network design, and quantum systems.

Optimal transport (OT) frameworks constitute a fundamental set of mathematical, algorithmic, and computational tools for quantifying and optimizing the transformation of one probability distribution (or collection of resources) into another with respect to an explicit ground cost. OT frameworks span discrete and continuous, static and dynamic, convex and non-convex, unregularized and regularized problems, and their methodologies underpin a broad landscape of applications across statistics, machine learning, network design, computer vision, domain adaptation, economics, quantum information, and control. The following entry surveys the central classes of OT frameworks, emphasizing theoretical structure, algorithmic design, and domain-specific implications, drawing on recent advances and rigorous results.

1. Classical and Generalized Formulations

Optimal transport was originally posed by Monge as the problem of finding a deterministic mapping $T:X\rightarrow Y$ pushing a source measure $\mu$ to a target $\nu$ with minimum total cost $\int_X c(x,T(x))\,d\mu(x)$. Kantorovich relaxed this to seek a coupling $\pi\in\Pi(\mu,\nu)$, i.e., a joint probability measure with prescribed marginals, minimizing $\int_{X\times Y} c(x,y)\,d\pi(x,y)$ (Moradi, 8 Jan 2025, Peyré et al., 2018). The relaxed problem is a linear program in $\pi$; it admits strong duality and a wealth of analytic structure.
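For reference, the dual problem behind this strong duality can be written in a standard form. The statement below assumes a continuous cost on compact spaces, and the potentials $\varphi$ and $\psi$ are notation introduced here for illustration rather than taken from the cited sources:

$\sup_{\varphi,\psi}\ \int_X \varphi(x)\,d\mu(x) + \int_Y \psi(y)\,d\nu(y) \quad \text{subject to} \quad \varphi(x) + \psi(y) \le c(x,y) \ \ \text{for all } (x,y)\in X\times Y$

Under these assumptions the supremum equals the Kantorovich minimum over $\Pi(\mu,\nu)$. In the discrete case both problems reduce to finite linear programs.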

Frameworks arise via extensions and generalizations:

Unbalanced OT: Relaxes the fixed-mass constraints by penalizing deviations of the marginals through divergence terms. This permits mass creation and destruction; in unbalanced or non-conservative OT, the feasible set includes non-normalized or mass-scaled couplings (Kováčová et al., 1 Oct 2025, Montesuma et al., 2023).
Semi-discrete and time-varying OT: Couples a discrete set (agents, points) with a time-evolving density via semi-discrete duality; leverages saddle flows on optimal plans and dual potentials for dynamic agent control (Napolitano et al., 29 Jan 2026).
Constrained OT: Incorporates additional linear or abstract constraints (e.g., elementwise prohibitions (Cang et al., 2022), martingale or path constraints (Ekren et al., 2016), or assignment structures (III et al., 2017)).
Nonlinear and composite OT: Allows for general convex or nonconvex objectives $f(P)$ over the polytope of couplings, including Gromov-Wasserstein, co-optimal transport, and regularized forms (Mishra et al., 2021).

The duality theory extends to Banach lattices and abstract convex cones, providing strong duality and characterizations for a wide variety of constraints and objective structures (Ekren et al., 2016).

2. Computational and Algorithmic Methodologies

The solution of OT problems at scale hinges on algorithmic innovations. Classical approaches such as the network simplex and auction algorithms directly solve (large-scale) linear programs, achieving strong guarantees for exact transport but with at least quadratic complexity in the size of supports (III et al., 2017, Peyré et al., 2018). Recent developments include:

Entropic Regularization and Sinkhorn Algorithms

Adding an entropy penalty yields strictly convex objectives and strictly positive solutions, enabling parallelizable iterative matrix-scaling via Sinkhorn–Knopp updates. The primal reads

$\min_{P\in U(a,b)} \langle C, P \rangle - \varepsilon H(P),\quad H(P) = -\sum_{i,j}P_{ij}(\log P_{ij}-1)$

with iterative updates $u \leftarrow a/(K v)$, $v \leftarrow b/(K^T u)$ for the Gibbs kernel $K = \exp(-C/\varepsilon)$ (taken entrywise). This achieves $O(n^2)$ per-iteration scaling for supports of size $n$ and can exploit GPU acceleration (Khamis et al., 2023, Moradi, 8 Jan 2025, Peyré et al., 2018).
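The scaling updates above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production solver: the function name `sinkhorn`, the fixed iteration count, and the toy data are assumptions made here, and the kernel is formed directly as $\exp(-C/\varepsilon)$ without the log-domain stabilization that small $\varepsilon$ usually requires.

```python
import math

def sinkhorn(a, b, C, eps, n_iter=500):
    """Entropic OT by Sinkhorn-Knopp scaling; returns the plan P as nested lists."""
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-C / eps), entrywise
    K = [[math.exp(-C[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # u <- a / (K v)
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # v <- b / (K^T u)
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Recover the plan P = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy problem (illustrative data): three source points and two target points on the line.
a = [0.5, 0.3, 0.2]
b = [0.6, 0.4]
x = [0.0, 1.0, 2.0]
y = [0.5, 1.5]
C = [[(xi - yj) ** 2 for yj in y] for xi in x]

P = sinkhorn(a, b, C, eps=0.1)
row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(len(a))) for j in range(len(b))]
```

After enough iterations, `row_sums` and `col_sums` should agree with `a` and `b` up to a small tolerance, and every entry of `P` is strictly positive, as the entropic penalty implies.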

Multiscale, Minibatch, and Slicing Approaches

Multiscale frameworks recursively coarsen problem structure, solving OT at increasingly fine resolutions with warm starts and restricted column generation, achieving near-linear runtime under low intrinsic dimension (Gerber et al., 2017). Sliced OT and min-sliced transport plan (min-STP) techniques further reduce computational cost by projecting distributions onto low-dimensional subspaces or learning parametric slicers (e.g., neural networks) for amortized or one-shot matching, with theoretical transferability guarantees under distributional drift (Liu et al., 24 Nov 2025, Khamis et al., 2023).
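The slicing idea can be illustrated with a plain Monte Carlo sliced Wasserstein estimate. The sketch below is not the min-STP method of the cited work: it assumes two equal-size, uniformly weighted point clouds in the plane and random (not learned) projection directions, and the names `wasserstein2_1d` and `sliced_w2` are hypothetical. It relies on the fact that in one dimension the optimal coupling for a convex cost matches sorted samples.

```python
import math
import random

def wasserstein2_1d(xs, ys):
    """Squared 2-Wasserstein distance between two equal-size uniform samples on the line."""
    xs, ys = sorted(xs), sorted(ys)
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sliced_w2(X, Y, n_proj=200, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance in 2D."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.uniform(0.0, math.pi)  # random direction on the half-circle
        d = (math.cos(theta), math.sin(theta))
        # Project both clouds onto the direction d and solve the 1D problem by sorting
        px = [p[0] * d[0] + p[1] * d[1] for p in X]
        py = [p[0] * d[0] + p[1] * d[1] for p in Y]
        total += wasserstein2_1d(px, py)
    return total / n_proj

# Illustrative data: a Gaussian cloud and a copy shifted by 3 along the first axis.
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
Y = [(x + 3.0, y) for x, y in X]

sw_same = sliced_w2(X, X)
sw_shift = sliced_w2(X, Y)
```

For a pure translation, each slice contributes $9\cos^2\theta$, so `sw_shift` should land near $9/2$ up to Monte Carlo error, while `sw_same` is zero. Each slice costs only a sort, which is where the computational saving comes from.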

Data-Driven and Adaptive Mesh Methods

Adaptive mesh and mixture methods decompose the marginals and coupling into localized components, enabling closed-form solutions for subproblems and an assignment LP, with refinement and parallel barycenter computation (Chen et al., 2017).

Manifold Optimization

The coupling polytope admits a smooth Riemannian structure supporting first- and second-order methods (e.g., Riemannian trust-region, CG/gradient flows) directly on the interior, generalizing beyond entropic regularization (Mishra et al., 2021).

3. Dynamic and Time-Varying OT Frameworks

Dynamic OT generalizes the cost from static matchings to time-dependent flows, as in the Benamou–Brenier formulation, which seeks a velocity field $v_t$ and density $\rho_t$ minimizing the kinetic energy $\int_0^1\!\int \rho_t(x)\,|v_t(x)|^2\,dx\,dt$ under mass conservation; the problem is discretized and solved efficiently via convex splitting schemes (Douglas–Rachford, ADMM) (Papadakis et al., 2013). Recent work extends these principles to time-varying coverage, whereby Lagrangian agents adapt to evolving densities, with exponential tracking rates and explicit one-dimensional solutions based on semi-discrete Kantorovich duality (Napolitano et al., 29 Jan 2026).
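Written out in a standard form, the Benamou–Brenier problem makes the mass-conservation constraint explicit. The statement below assumes the quadratic ground cost on $\mathbb{R}^d$ and a unit time horizon; the symbols $\rho_t$, $v_t$, and $m$ are notation chosen here:

$W_2^2(\mu,\nu) = \min_{(\rho,v)} \int_0^1\!\int_{\mathbb{R}^d} \rho_t(x)\,|v_t(x)|^2\,dx\,dt \quad \text{subject to} \quad \partial_t \rho_t + \nabla\cdot(\rho_t v_t) = 0,\ \ \rho_0 = \mu,\ \ \rho_1 = \nu$

With the change of variables $m = \rho v$, the integrand becomes $|m|^2/\rho$, which is jointly convex in $(\rho, m)$, and the continuity equation becomes linear. This convex structure is what makes proximal splitting schemes applicable.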

Non-conservative OT further introduces mass change via a mass-change factor, with associated dual formulations and dynamic (Eulerian/Lagrangian) analogues to Benamou–Brenier (Kováčová et al., 1 Oct 2025).

4. Structured, Regularized, and Constrained OT Variants

The basic framework admits several structured extensions:

Supervised OT: Permits hard elementwise constraints that prohibit selected entries of the coupling, solved by generalized Sinkhorn/Dykstra iteration (Cang et al., 2022). This supports applications demanding forbidden correspondences or locally blocked mass (e.g., color transfer, logistics under route closure).
Robust/latent/coupled OT: Low-rank factorizations, anchor-based plans, and mixture models yield robust mappings and interpretable correspondences for high-dimensional, noisy, or outlier-contaminated problems (Lin et al., 2020). Latent OT, specifically, parameterizes couplings via anchor-point chains, admits explicit sample complexity bounds, and demonstrates strong empirical resilience to data shift.
Gromov–Wasserstein and higher-order OT: Matrix and tensor-valued couplings yield frameworks for aligning relational data structures (e.g., graphs, point clouds) by loss functions on intra-domain distance discrepancies (Montesuma et al., 2023, Moradi, 8 Jan 2025).
Constrained and martingale OT: Abstract duality extends to Banach-lattice settings with linear or path-dependent constraints, yielding broad applicability to structured financial hedging, pathwise inequalities, or stochastic control (Ekren et al., 2016).
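To make the forbidden-correspondence idea concrete, the sketch below zeroes the kernel on prohibited pairs (equivalent to an infinite cost) and then runs ordinary Sinkhorn scaling. This is a simplified stand-in, not the generalized Sinkhorn/Dykstra scheme of the cited work. It assumes that a feasible plan exists on the allowed entries and that every row and column keeps at least one allowed entry; the name `masked_sinkhorn` and the toy data are hypothetical.

```python
import math

def masked_sinkhorn(a, b, C, allowed, eps, n_iter=1000):
    """Entropic OT with hard elementwise prohibitions via a masked Gibbs kernel."""
    n, m = len(a), len(b)
    # Forbidden pairs get kernel value 0, i.e. an infinite transport cost.
    K = [[math.exp(-C[i][j] / eps) if allowed[i][j] else 0.0 for j in range(m)]
         for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Illustrative data: uniform marginals on three points, cost |i - j|,
# with the otherwise cheapest match (0, 0) prohibited.
a = [1.0 / 3] * 3
b = [1.0 / 3] * 3
C = [[abs(i - j) for j in range(3)] for i in range(3)]
allowed = [[not (i == 0 and j == 0) for j in range(3)] for i in range(3)]

P = masked_sinkhorn(a, b, C, allowed, eps=0.5)
row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(3)) for j in range(3)]
```

The resulting plan carries exactly zero mass on the prohibited entry while still matching both marginals, so the mass of source 0 is rerouted to its remaining allowed targets.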

5. Theoretical Guarantees and Regularity

OT frameworks exhibit strong theoretical properties, including dual attainability, error and bias bounds, and sample complexity rates dependent on dimension and intrinsic geometry. Notable results include:

Explicit error bounds for approximate auction methods (a guaranteed bound on the primal gap to the optimum) (III et al., 2017).
Multiscale OT: bounds on the empirical objective error; convergence rates for sliced and robust frameworks (Gerber et al., 2017, Liu et al., 24 Nov 2025).
Dynamic OT: exponential convergence rates for agent-barycenter error under primal-dual flows in the time-varying control context (Napolitano et al., 29 Jan 2026).
Folded OT: extension to convex sets not representable as simplices via Choquet theory, with metric properties carrying over to the quantum and semiclassical domains (Borsoni, 1 Dec 2025).
Regularity theory (Ma–Trudinger–Wang condition): precise geometric and PDE-based criteria for the continuity and smoothness of Monge maps, with obstructions arising from the failure of curvature conditions in the cost geometry (Khan et al., 2022).

6. Applications and Domain-Specific Impact

OT frameworks are applied across a wide range of domains:

Machine learning and statistics: OT distances provide loss functions and metrics for domain adaptation, transfer learning, representation learning, clustering, generative modeling (WGANs, Sinkhorn GANs, VAEs), and fairness (Montesuma et al., 2023, Moradi, 8 Jan 2025, Lin et al., 2021, Salimans et al., 2018).
Imaging and computational anatomy: Multiscale OT enables large-scale comparison and regression on structures extracted from volumetric MRI, outperforming alternatives in predictive power (Gerber et al., 2017).
Distributed control and robotics: Time-varying OT frameworks yield principled multi-agent coverage strategies for environmental monitoring and sensor networks, with rigorous tracking guarantees in Wasserstein space (Napolitano et al., 29 Jan 2026).
Quantum and semiclassical systems: Folded Kantorovich costs underpin rigorous quantum–classical comparisons, separable quantum transport metrics, and semiclassical analysis (Borsoni, 1 Dec 2025).
Network design: Dynamic Lyapunov-based OT models generalize classical Physarum-inspired network formation to multi-commodity and loop-forming infrastructures (Lonardi et al., 2020).
Economics and logistics: Non-conservative OT models portfolio rebalancing and value-preserving asset transfers with explicit LP formulations (Kováčová et al., 1 Oct 2025).

7. Challenges, Trade-offs, and Future Directions

Despite the breadth of available frameworks, optimal transport remains computationally demanding for large-scale, high-dimensional problems (curse of dimensionality), requiring low-rank, sliced, or learned surrogates for efficient deployment. Regularized (entropic) and mini-batch variants introduce bias-variance trade-offs and demand careful parameter tuning. Open research directions include the development of rigorous guarantees for neural OT solvers, scalable algorithms for high-order and structure-rich domains (graphs, sequences), robust handling of constraints and prohibitions, and unified theory for quantum and classical couplings.
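The bias introduced by entropic regularization can be seen on a toy problem where the exact answer is known. In the sketch below the two marginals are identical and the cost vanishes on the diagonal, so the unregularized optimal cost is 0; the function name `sinkhorn_cost`, the grid of $\varepsilon$ values, and the data are assumptions made for illustration.

```python
import math

def sinkhorn_cost(a, b, C, eps, n_iter=500):
    """Transport cost <C, P_eps> of the entropic plan computed by Sinkhorn scaling."""
    n, m = len(a), len(b)
    K = [[math.exp(-C[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return sum(C[i][j] * u[i] * K[i][j] * v[j] for i in range(n) for j in range(m))

# Identical uniform marginals and cost |i - j|: the exact optimal cost is 0.
a = [1.0 / 3] * 3
b = [1.0 / 3] * 3
C = [[abs(i - j) for j in range(3)] for i in range(3)]

eps_grid = [0.05, 0.2, 1.0, 5.0]
costs = [sinkhorn_cost(a, b, C, eps) for eps in eps_grid]
```

The transport cost of the entropic plan grows with $\varepsilon$: it is close to the exact value 0 for the smallest $\varepsilon$ and approaches the cost of the independent coupling $a \otimes b$ (here $8/9$) as $\varepsilon$ becomes large. Smaller $\varepsilon$ reduces this bias but slows convergence and risks numerical underflow in the kernel, which is the tuning trade-off noted above.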

A unifying trend is the integration of OT frameworks with information geometry, variational flows, and statistical estimation, so that advances in one area can carry over to other OT methodologies (Khan et al., 2022).

References

(Moradi, 8 Jan 2025) A Survey on Algorithmic Developments in Optimal Transport Problem with Applications
(III et al., 2017) General auction method for real-valued optimal transport
(Gerber et al., 2017) Multiscale Strategies for Computing Optimal Transport
(Napolitano et al., 29 Jan 2026) Optimal Transport for Time-Varying Multi-Agent Coverage Control
(Lin et al., 2020) Making transport more robust and interpretable by moving data through a small number of anchor points
(Liu et al., 24 Nov 2025) Efficient Transferable Optimal Transport via Min-Sliced Transport Plans
(Kováčová et al., 1 Oct 2025) Non-conservative optimal transport
(Cang et al., 2022) Supervised Optimal Transport
(Borsoni, 1 Dec 2025) Folded optimal transport and its application to separable quantum optimal transport
(Ekren et al., 2016) Constrained Optimal Transport
(Mishra et al., 2021) Manifold optimization for non-linear optimal transport problems
(Salimans et al., 2018) Improving GANs Using Optimal Transport
(Montesuma et al., 2023) Recent Advances in Optimal Transport for Machine Learning
(Khamis et al., 2023) Scalable Optimal Transport Methods in Machine Learning: A Contemporary Survey
(Papadakis et al., 2013) Optimal Transport with Proximal Splitting
(Chen et al., 2017) A data-driven linear-programming methodology for optimal transport
(Khan et al., 2022) When Optimal Transport Meets Information Geometry
(Lonardi et al., 2020) Designing optimal networks for multi-commodity transport problem
(Lin et al., 2021) Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport