Weak Optimal Transport Overview
- Weak optimal transport is a generalized framework where cost functions depend nonlinearly on conditional probability distributions.
- It establishes strong duality with well-defined primal and dual formulations and ensures optimal plan stability through cyclical monotonicity.
- Applications span economics, finance, and data science, with computational methods including mirror descent and neural approximations.
Weak optimal transport (WOT) generalizes the classical Monge–Kantorovich optimal transport framework: the cost of transporting a source point may depend nonlinearly, or even nonlocally, on the conditional law of the coupling at that point. This variational framework, introduced by Gozlan, Roberto, Samson, and Tetali and further developed by many others, unifies and extends classical OT, barycentric transport, and martingale and entropic optimal transport, and provides new tools for analysis, computation, economics, and probability.
1. Formal Framework and Problem Statement
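Let $X$ and $Y$ be Polish spaces, $\mu \in \mathcal{P}(X)$, $\nu \in \mathcal{P}(Y)$, with $\mathcal{P}(\cdot)$ the set of Borel probability measures, and $\Pi(\mu,\nu)$ the set of couplings with marginals $\mu$ and $\nu$. Each coupling $\pi \in \Pi(\mu,\nu)$ admits a disintegration $\pi(dx,dy) = \mu(dx)\,\pi_x(dy)$.
The weak optimal transport problem is defined for a measurable cost function $C : X \times \mathcal{P}(Y) \to \mathbb{R} \cup \{+\infty\}$, convex and lower semicontinuous (l.s.c.) in the second argument (for the weak or Wasserstein topology). The primal problem is:
$$V(\mu,\nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_X C(x,\pi_x)\,\mu(dx).$$
Classical OT corresponds to $C(x,p) = \int_Y c(x,y)\,p(dy)$.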
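For finitely supported measures the primal objective reduces to a weighted sum over source atoms. The following is a minimal sketch, assuming one-dimensional supports and a coupling stored as a matrix; the names `weak_transport_cost` and `linear_cost` are hypothetical helpers introduced only for illustration. It checks that a cost that is linear in the conditional law gives back the classical transport cost.

```python
import numpy as np

def weak_transport_cost(pi, xs, cost_fn):
    """Evaluate sum_i mu_i * C(x_i, pi_{x_i}) for a discrete coupling matrix pi."""
    mu = pi.sum(axis=1)                      # first marginal
    total = 0.0
    for i, x in enumerate(xs):
        if mu[i] > 0:
            conditional = pi[i] / mu[i]      # conditional law pi_{x_i}
            total += mu[i] * cost_fn(x, conditional)
    return total

# Illustrative data (assumed, not taken from the text).
xs = np.array([0.0, 1.0])
ys = np.array([0.0, 1.0, 2.0])
pi = np.array([[0.2, 0.2, 0.1],
               [0.1, 0.1, 0.3]])

def linear_cost(x, p):
    """C(x, p) = sum_j c(x, y_j) p_j with c(x, y) = (x - y)^2."""
    return float(np.sum((x - ys) ** 2 * p))

weak_value = weak_transport_cost(pi, xs, linear_cost)
classical_value = float(np.sum(pi * (xs[:, None] - ys[None, :]) ** 2))
```

Any other cost that is convex in the conditional law can be passed as `cost_fn` without changing the evaluator.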
Duality
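The dual problem involves pairs $(\varphi,\psi)$, called admissible if for all $x \in X$ and $p \in \mathcal{P}(Y)$:
$$\varphi(x) + \int_Y \psi\,dp \le C(x,p).$$
The dual value is:
$$D(\mu,\nu) = \sup_{(\varphi,\psi)\ \text{admissible}} \left\{ \int_X \varphi\,d\mu + \int_Y \psi\,d\nu \right\}.$$
For fixed $\psi$, one defines
$$R_C\psi(x) = \inf_{p \in \mathcal{P}(Y)} \left\{ C(x,p) - \int_Y \psi\,dp \right\},$$
so that
$$D(\mu,\nu) = \sup_{\psi} \left\{ \int_X R_C\psi\,d\mu + \int_Y \psi\,d\nu \right\}.$$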
Key Assumptions
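- Lower boundedness: the cost $C$ admits a lower bound of a prescribed form.
- Growth: $C$ satisfies a growth bound expressed through measurable functions and a convex, increasing, super-coercive function.
- Truncation continuity: $C$ satisfies a continuity property under truncation.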
Fundamental Theorem
Under these conditions:
- Primal attainment: the infimum is attained.
- Strong duality: the primal and dual values coincide.
- Under suitable growth/truncation continuity, dual attainment also holds.
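- Complementary slackness: for a primal optimizer $\pi^*$ with disintegration $(\pi^*_x)_x$ and a dual optimizer $(\varphi^*,\psi^*)$, the admissibility constraint holds with equality at $p = \pi^*_x$, $\mu$-almost surely (Beiglböck et al., 27 Jan 2025).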
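The link between duality and complementary slackness can be seen from a short computation. The sketch below assumes the sign convention in which a pair $(\varphi,\psi)$ is admissible when $\varphi(x) + \int_Y \psi\,dp \le C(x,p)$; other references use the opposite sign for $\psi$. For any admissible pair and any coupling $\pi$ of $\mu$ and $\nu$ with disintegration $(\pi_x)_x$:

$$
\begin{aligned}
\int_X \varphi\,d\mu + \int_Y \psi\,d\nu
&= \int_X \left[ \varphi(x) + \int_Y \psi\,d\pi_x \right] \mu(dx) \\
&\le \int_X C(x,\pi_x)\,\mu(dx).
\end{aligned}
$$

The first line uses $\nu = \int_X \pi_x\,\mu(dx)$. Taking the supremum on the left and the infimum on the right shows that the dual value never exceeds the primal value. When the two values agree and both optimizers exist, the inequality is an equality, so the nonnegative integrand $C(x,\pi^*_x) - \varphi^*(x) - \int_Y \psi^*\,d\pi^*_x$ must vanish for $\mu$-almost every $x$.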
2. Principal Examples and Recoveries
Barycentric and Convex Costs
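If $C(x,p) = \theta\!\left(x - \int_Y y\,p(dy)\right)$ with $\theta$ convex and $X = Y = \mathbb{R}^d$, e.g., $\theta = |\cdot|^2$ (barycentric quadratic cost):
$$V_\theta(\mu,\nu) = \inf_{\eta \le_c \nu}\ \inf_{\gamma \in \Pi(\mu,\eta)} \int \theta(x-z)\,\gamma(dx,dz),$$
where $\le_c$ is the convex order.
The dual becomes:
$$V_\theta(\mu,\nu) = \sup_{f\ \text{convex}} \left\{ \int Q_\theta f\,d\mu - \int f\,d\nu \right\}, \qquad Q_\theta f(x) = \inf_{y \in \mathbb{R}^d} \left\{ f(y) + \theta(x-y) \right\},$$
with the supremum taken over suitably regular convex functions $f$.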
Complementary slackness yields a barycentric map characterized by subgradient conditions (Beiglböck et al., 27 Jan 2025, Cazelles et al., 2021, Guo et al., 26 Nov 2025).
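A two-point example makes the difference between the barycentric cost and the classical quadratic cost concrete. This is a minimal sketch with assumed data: the source is a Dirac mass at 0 and the target puts mass 1/2 on each of -1 and 1, so the only coupling is the product coupling. The function names are hypothetical.

```python
import numpy as np

def barycentric_quadratic_cost(pi, xs, ys):
    """sum_i mu_i * (x_i - barycenter of pi_{x_i})^2 for 1-D supports."""
    mu = pi.sum(axis=1)
    bary = (pi @ ys) / mu                    # conditional means
    return float(np.sum(mu * (xs - bary) ** 2))

def classical_quadratic_cost(pi, xs, ys):
    """sum_{i,j} pi_ij * (x_i - y_j)^2."""
    return float(np.sum(pi * (xs[:, None] - ys[None, :]) ** 2))

xs = np.array([0.0])                         # source: Dirac mass at 0
ys = np.array([-1.0, 1.0])                   # target: uniform on {-1, 1}
pi = np.array([[0.5, 0.5]])                  # the unique coupling

weak_value = barycentric_quadratic_cost(pi, xs, ys)
classical_value = classical_quadratic_cost(pi, xs, ys)
```

Here the source is dominated by the target in convex order, so the barycentric cost vanishes while the classical quadratic cost equals 1: the weak cost only penalizes the displacement of the conditional mean.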
Entropic and Martingale OT
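For
$$C(x,p) = \int_Y c(x,y)\,p(dy) + \varepsilon\,H(p \mid \nu),$$
where $H(\cdot \mid \nu)$ denotes relative entropy with respect to $\nu$ and $\varepsilon > 0$, the weak-OT becomes entropic OT.
If additionally $C(x,p) = +\infty$ unless $\int_Y y\,p(dy) = x$, one imposes a martingale condition, leading to weak martingale optimal transport (WMOT). Duality and structural results extend to the entropic-martingale context (Beiglböck et al., 27 Jan 2025, Chung et al., 2021, Carlier et al., 20 Nov 2025).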
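The entropic case rests on the chain rule for relative entropy: averaging the relative entropy of the conditional laws against the first marginal gives the relative entropy of the coupling with respect to the product measure. The following numerical check is a sketch on an assumed strictly positive 2-by-3 coupling; `relative_entropy` is a hypothetical helper.

```python
import numpy as np

def relative_entropy(p, q):
    """H(p | q) = sum p * log(p / q) for strictly positive arrays of equal shape."""
    return float(np.sum(p * np.log(p / q)))

# Illustrative strictly positive coupling (assumed data).
pi = np.array([[0.2, 0.2, 0.1],
               [0.1, 0.1, 0.3]])
mu = pi.sum(axis=1)                          # first marginal
nu = pi.sum(axis=0)                          # second marginal

# Weak-OT form: sum_i mu_i * H(pi_{x_i} | nu)
conditional_form = sum(
    mu[i] * relative_entropy(pi[i] / mu[i], nu) for i in range(len(mu))
)

# Entropic-OT form: H(pi | mu (x) nu)
joint_form = relative_entropy(pi, np.outer(mu, nu))
```

The two quantities agree up to floating-point error, which is why the entropic penalty can be written as a cost depending on $x$ and the conditional law only.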
Hybrid Problems
Mixing barycentric and martingale/entropic constraints or costs yields continuous families interpolating between classical OT, martingale OT, and entropic OT, all captured within the WOT framework (Beiglböck et al., 27 Jan 2025, Guo et al., 26 Nov 2025).
3. Cyclical Monotonicity and Structural Optimality
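Optimal weak transport plans are characterized by a form of cyclical monotonicity. A coupling $\pi$ is $C$-monotone if, for finite families $x_1,\dots,x_N$ (taken from a set of full $\mu$-measure), with competitor measures $q_1,\dots,q_N \in \mathcal{P}(Y)$, $\sum_{i=1}^N q_i = \sum_{i=1}^N \pi_{x_i}$:
$$\sum_{i=1}^N C(x_i,\pi_{x_i}) \le \sum_{i=1}^N C(x_i,q_i).$$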
Necessity and sufficiency of this condition (under extra regularity, such as $C$ being Lipschitz in the measure variable) provide a direct generalization of classical cyclical monotonicity (Veraguas et al., 2018, Backhoff-Veraguas et al., 2019). This also underpins the stability theory: optimal plans are stable under perturbations of the marginals or the cost in the adapted topology, which captures joint weak convergence of marginals and conditional laws (Backhoff-Veraguas et al., 2019, Beiglböck et al., 2021).
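The monotonicity condition can be tested directly on small examples by comparing the conditional laws of a plan with competitors that carry the same total mass. The sketch below assumes the barycentric quadratic cost on a two-point target; `bary_cost` and `monotonicity_gap` are hypothetical names. A negative gap exhibits a competitor family that is strictly cheaper, so the plan cannot be optimal.

```python
import numpy as np

ys = np.array([-1.0, 1.0])                   # common target support (assumed)

def bary_cost(x, p):
    """Barycentric quadratic cost C(x, p) = (x - mean of p)^2."""
    return float((x - p @ ys) ** 2)

def monotonicity_gap(xs, kernels, competitors):
    """Return sum_i C(x_i, q_i) - sum_i C(x_i, pi_{x_i}).

    The competitors q_i must satisfy sum_i q_i = sum_i pi_{x_i}.
    """
    kernels = np.asarray(kernels, dtype=float)
    competitors = np.asarray(competitors, dtype=float)
    if not np.allclose(kernels.sum(axis=0), competitors.sum(axis=0)):
        raise ValueError("competitors must preserve the total mass sum_i pi_{x_i}")
    current = sum(bary_cost(x, p) for x, p in zip(xs, kernels))
    alternative = sum(bary_cost(x, q) for x, q in zip(xs, competitors))
    return alternative - current

xs = np.array([-1.0, 1.0])
crossed = [[0.0, 1.0], [1.0, 0.0]]           # -1 sent to +1 and +1 sent to -1
aligned = [[1.0, 0.0], [0.0, 1.0]]           # each point stays in place

gap_crossed = monotonicity_gap(xs, crossed, aligned)   # negative: violation
gap_aligned = monotonicity_gap(xs, aligned, crossed)   # positive: no violation here
```

A single nonnegative gap does not prove monotonicity, since the condition must hold for every finite family and every admissible competitor; the check is only a way to detect violations.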
4. Dynamic, Martingale, and PDE Connections
WOT admits a dynamic (PDE) characterization generalizing the Benamou–Brenier formula:
- The static weak transport problem is equivalent to a dynamic minimization over curves solving a generalized Fokker–Planck equation with a (possibly measure-valued) diffusion tensor and a convex integrated cost (Bulanyi, 2023).
- Barycentric WOT can be described dynamically using drift–diffusion SDEs, with cost determined by the drift term, and further extended to martingale settings where the drift vanishes and the cost penalizes only covariance (Guo et al., 26 Nov 2025).
This establishes equivalence between static (coupling) and dynamic (PDE/SDE) perspectives for broad classes of convex costs.
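For reference, the classical Benamou–Brenier formula that these dynamic formulations generalize expresses the squared quadratic Wasserstein distance on $\mathbb{R}^d$ as a kinetic-energy minimization over density curves $\rho_t$ and velocity fields $v_t$ (notation assumed here, not taken from the cited works):

$$
W_2^2(\mu,\nu) = \inf_{(\rho_t, v_t)} \left\{ \int_0^1 \!\! \int_{\mathbb{R}^d} |v_t(x)|^2 \,\rho_t(dx)\,dt \;:\; \partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0,\ \rho_0 = \mu,\ \rho_1 = \nu \right\}.
$$

In the weak setting described above, the continuity equation is replaced by a Fokker–Planck equation with a diffusion term, and the kinetic energy by a convex cost on the drift and diffusion coefficients.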
5. Computational Methods and Algorithms
Efficient computation of WOT is challenging due to nonlinearity and complexity of the transport constraints.
- Mirror descent methods: For barycentric and unnormalized-kernel variants (WOTUK), primal and dual mirror descent schemes with an entropic mirror map (KL divergence) and Sinkhorn projections are provably convergent and scalable (Paty et al., 2022).
- Neural approaches: Neural parameterizations of stochastic transport maps can approximate any WOT plan and can be optimized via a max–min (saddle-point) objective; this framework accommodates high-dimensional, nonlinear, and stochastic transport settings (Korotin et al., 2022).
These algorithms have been validated in economics (matching models), machine learning (distributional alignment, barycenters), and vision (image translation).
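The mirror descent idea can be illustrated on the barycentric quadratic cost for discrete one-dimensional measures. The following is a generic sketch of entropic mirror descent with a Sinkhorn (KL) projection onto the coupling set, not a reproduction of the cited algorithms; the step size, iteration counts, data, and function names are all assumptions made for illustration.

```python
import numpy as np

def barycentric_cost(pi, xs, ys):
    """sum_i mu_i * (x_i - conditional mean of row i)^2."""
    mu = pi.sum(axis=1)
    bary = (pi @ ys) / mu
    return float(np.sum(mu * (xs - bary) ** 2))

def sinkhorn_projection(kernel, mu, nu, n_iter=200):
    """KL-project a positive matrix onto couplings of (mu, nu) by alternating scaling."""
    pi = kernel.copy()
    for _ in range(n_iter):
        pi *= (mu / pi.sum(axis=1))[:, None]     # match row marginals
        pi *= (nu / pi.sum(axis=0))[None, :]     # match column marginals
    return pi

def mirror_descent(xs, ys, mu, nu, step=0.5, n_steps=100):
    """Entropic mirror descent on the barycentric quadratic objective."""
    pi = np.outer(mu, nu)                        # start from the product coupling
    for _ in range(n_steps):
        bary = (pi @ ys) / mu
        # Gradient of the objective in pi_ij: 2 * (bary_i - x_i) * y_j
        grad = 2.0 * (bary - xs)[:, None] * ys[None, :]
        # Multiplicative (KL) step followed by projection onto the couplings
        pi = sinkhorn_projection(pi * np.exp(-step * grad), mu, nu)
    return pi

# Illustrative data: the source is dominated by the target in convex order,
# so the optimal barycentric cost is zero.
xs = np.array([-0.5, 0.5])
mu = np.array([0.5, 0.5])
ys = np.array([-1.0, 0.0, 1.0])
nu = np.array([1.0, 1.0, 1.0]) / 3.0

initial_cost = barycentric_cost(np.outer(mu, nu), xs, ys)
pi_opt = mirror_descent(xs, ys, mu, nu)
final_cost = barycentric_cost(pi_opt, xs, ys)
```

On this example the objective falls from 0.25 at the product coupling to nearly zero, and the conditional means of the final plan approach the source points. For larger problems the inner projection would typically be run in the log domain for numerical stability.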
6. Weak Barycenters and Generalizations
Weak barycenters, defined via minimization of sums of WOT costs over a family of laws, generalize Wasserstein barycenters. Characterization and computation exploit the structure of convex ordering:
- Existence: Tightness and lower semicontinuity arguments guarantee minimizers under moment conditions (Cazelles et al., 2021).
- Characterization: Weak barycenters extract common geometric/latent information and have robustness advantages compared to classical barycenters.
- Algorithms: Deterministic (fixed-point), stochastic (streaming), and optimization (proximal gradient) methods are available.
Open problems concern uniqueness (especially in higher dimensions), stability, and geometric properties.
7. Applications and Extensions
- Economics: WOT captures nonlinear aggregation in matching models, labor assignment, and production economics, providing structural insights and richer matching patterns than OT (Paty et al., 2022).
- Finance: WMOT models are fundamental for robust pricing under martingale constraints, with applications including the robust superhedging of options and VIX futures, and with proven stability under distributional uncertainty (Beiglböck et al., 2021).
- Information Theory: Rate-distortion functions, Shannon bounds, and connections to the Schrödinger bridge are realized within the WOT setting (Zou et al., 16 Jan 2025).
- Risk Measures: Convex risk measures with WOT penalties yield primal and dual representations, with computational schemes based on variational and neural optimization (Kupper et al., 2023).
- Metric Geometry and Analysis: Extensions to barycentric costs, entropic regularizations, and transport with moment constraints expand the landscape of metric and probabilistic geometry (Carlier et al., 20 Nov 2025, Chung et al., 2021, Chung et al., 2019).
Table: Core Weak OT Paradigms
| Cost Formulation | Characteristic Constraint | Classical Example |
|---|---|---|
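| $C(x,p) = \int c(x,y)\,p(dy)$ | Linear (classical OT) | Wasserstein distance |
| $C(x,p) = \theta\left(x - \int y\,p(dy)\right)$ | Barycentric (convex order) | Brenier–Strassen |
| $C(x,p) = \int c(x,y)\,p(dy) + \varepsilon\,H(p \mid \nu)$ | Entropic/Schrödinger regularization | Entropic OT |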
| Martingale constraint () | Mean-preserving (martingale OT) | Martingale couplings |
This unified convex-analytic perspective recovers and extends key foundational results of optimal transport (duality, structure of optimizers, and characterization via potential functions or subgradients) and admits flexible hybridizations supporting applications in analysis, data science, and economics (Beiglböck et al., 27 Jan 2025, Guo et al., 26 Nov 2025, Choné et al., 2022).