Weak Optimal Transport Overview

Updated 27 November 2025
  • Weak optimal transport is a generalized framework where cost functions depend nonlinearly on conditional probability distributions.
  • It establishes strong duality with well-defined primal and dual formulations and ensures optimal plan stability through cyclical monotonicity.
  • Applications span economics, finance, and data science, with computational methods including mirror descent and neural approximations.

Weak optimal transport (WOT) generalizes the classical Monge–Kantorovich optimal transport framework: the cost of transporting a source point may depend nonlinearly, or even nonlocally, on the conditional law of the coupling at that point. This variational framework, introduced by Gozlan, Roberto, Samson, and Tetali and further developed by others, unifies and extends classical OT, barycentric transport, and martingale and entropic optimal transport, and provides new tools for analysis, computation, economics, and probability.

1. Formal Framework and Problem Statement

Let $X, Y$ be Polish spaces, $\mu \in \mathcal{P}(X)$, $\nu \in \mathcal{P}_p(Y)$ with $p \geq 1$, and $\Pi(\mu, \nu)$ the set of couplings with marginals $\mu, \nu$. Each coupling $\pi \in \Pi(\mu,\nu)$ admits a disintegration $\pi(dx,dy) = \mu(dx)\,\pi_x(dy)$.

The weak optimal transport problem is defined for a measurable cost function $C : X \times \mathcal{P}_p(Y) \to [0,\infty]$, convex and lower semicontinuous (l.s.c.) in the second argument (for the weak or Wasserstein topology). The primal problem is:

$$\mathrm{WOT}_C(\mu, \nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_X C\big(x, \pi_x\big) \,\mu(dx).$$

Classical OT corresponds to $C(x,\rho) = \int_Y c(x,y)\,\rho(dy)$.
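On finite supports the primal objective is a weighted sum over source points, which makes the role of the disintegration concrete. The following is a minimal sketch, not a solver: it only evaluates the objective for a given coupling. The helper name `wot_objective`, the grids, and the coupling are illustrative assumptions that do not come from the cited works.

```python
import numpy as np

def wot_objective(pi, cost):
    """Evaluate sum_i mu_i * C(x_i, pi_{x_i}) for a discrete coupling.

    pi   : (n, m) array of joint weights pi[i, j] on the pairs (x_i, y_j)
    cost : callable (i, rho) -> float, where rho is a conditional law on the y-grid
    """
    mu = pi.sum(axis=1)                      # first marginal
    total = 0.0
    for i in range(pi.shape[0]):
        if mu[i] > 0:
            rho = pi[i] / mu[i]              # disintegration kernel pi_{x_i}
            total += mu[i] * cost(i, rho)
    return total

# Hypothetical data: two source points, three target points.
x = np.array([-1.0, 1.0])
y = np.array([-2.0, 0.0, 2.0])
pi = np.array([[0.25, 0.25, 0.00],
               [0.00, 0.25, 0.25]])

c = (x[:, None] - y[None, :]) ** 2           # pointwise cost c(x, y) = |x - y|^2

# Linear cost: C(x, rho) = int c(x, y) rho(dy), i.e. classical OT.
linear_cost = lambda i, rho: float(c[i] @ rho)
# Barycentric cost: C(x, rho) = |x - mean(rho)|^2, nonlinear in rho.
bary_cost = lambda i, rho: float((x[i] - y @ rho) ** 2)

classical_value = wot_objective(pi, linear_cost)
barycentric_value = wot_objective(pi, bary_cost)
```

With the linear cost the value coincides with the usual transport cost `np.sum(c * pi)`. With the barycentric cost the same coupling has zero cost, because each kernel is centered at its source point.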

Duality

The dual problem involves pairs $(f,g) \in L^1(\mu)\times L^1(\nu)$, called admissible if, for all $x$ and all $\rho$ with $g \in L^1(\rho)$:

$$f(x) + \rho(g) \leq C(x, \rho).$$

The dual value is:

$$D_C(\mu, \nu) = \sup\{ \mu(f) + \nu(g) : (f,g)\ \text{admissible} \}.$$

For fixed $g$, one defines

$$g^C(x) := \inf_{\rho \in \mathcal{P}_p(Y),\,g \in L^1(\rho)} \{ C(x,\rho) - \rho(g) \},$$

so that

$$D_C(\mu, \nu) = \sup_{g \in L^1(\nu)} \{ \mu(g^C) + \nu(g) \}.$$
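As a sanity check, the transform $g^C$ reduces to the classical $c$-transform when the cost is linear. The short derivation below assumes $C(x,\rho) = \int_Y c(x,y)\,\rho(dy)$ and that $g$ is finite everywhere, so that every Dirac mass is an admissible $\rho$.

```latex
\begin{aligned}
g^C(x)
  &= \inf_{\rho} \int_Y \big( c(x,y) - g(y) \big)\,\rho(dy) \\
  &= \inf_{y \in Y} \big\{ c(x,y) - g(y) \big\}.
\end{aligned}
```

The second equality holds because the integral of a function against a probability measure is at least the infimum of that function, and the Dirac masses $\rho = \delta_y$ approach this infimum.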

Key Assumptions

  • Lower boundedness: There exist $a_\ell \in L^1(\mu)$, $b_\ell \in L^1(\nu)$ such that $C(x,\rho) \geq a_\ell(x) + \rho(b_\ell)$.
  • Growth: There exist measurable $a, b$ and a convex, increasing, super-coercive function $h$ such that $C(x,\rho) \leq a(x) + \rho(b) + \int h(d\rho/d\nu)\, d\nu$.
  • Truncation continuity: If $Y_k \uparrow Y$, then $C(x,\rho) \geq \limsup_{k \to \infty} C\big(x, \rho|_{Y_k}/\rho(Y_k)\big)$.

Fundamental Theorem

Under these conditions:

  • Primal attainment: the infimum is attained.
  • Strong duality: $\mathrm{WOT}_C(\mu, \nu) = D_C(\mu, \nu)$.
  • Under suitable growth/truncation continuity, dual attainment also holds.
  • Complementary slackness: for a primal optimizer $\pi$ and dual optimizer $(f,g)$, $C(x, \pi_x) = f(x) + \pi_x(g)$ $\mu$-almost surely (Beiglböck et al., 27 Jan 2025).
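The easy half of the duality (the dual value never exceeds the primal value) follows directly from the definitions. For any admissible pair $(f,g)$ and any coupling $\pi \in \Pi(\mu,\nu)$ with disintegration $(\pi_x)$:

```latex
\begin{aligned}
\mu(f) + \nu(g)
  &= \int_X f(x)\,\mu(dx) + \int_X \pi_x(g)\,\mu(dx)
     && \text{since } \nu(dy) = \int_X \pi_x(dy)\,\mu(dx) \\
  &= \int_X \big[ f(x) + \pi_x(g) \big]\,\mu(dx) \\
  &\leq \int_X C(x, \pi_x)\,\mu(dx)
     && \text{by admissibility with } \rho = \pi_x .
\end{aligned}
```

Taking the supremum over admissible pairs and the infimum over couplings gives $D_C(\mu,\nu) \leq \mathrm{WOT}_C(\mu,\nu)$. When both values coincide and are attained, the inequality in the last line must be an equality $\mu$-almost surely, which is exactly the complementary slackness condition.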

2. Principal Examples and Recoveries

Barycentric and Convex Costs

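If $C(x,\rho) = h(x - m_\rho)$ with $h$ convex and $m_\rho = \int y\,\rho(dy)$ the barycenter of $\rho$, the cost is called barycentric. For the quadratic case $h(k)=|k|^2$:

$$\mathrm{WOT}_C(\mu,\nu) = \inf_{\eta \leq_c \nu} W_2^2(\mu,\eta)$$

where $\leq_c$ is the convex order.

For the case $h(k)=|k|$, the dual becomes $\sup\{ \mu(\phi) - \nu(\phi) : \phi \text{ convex, 1-Lipschitz} \}$. For a general convex $h$, the term $\mu(\phi)$ is replaced by $\mu(Q_h\phi)$ with the infimal convolution $Q_h\phi(x) = \inf_y \{\phi(y) + h(x-y)\}$, and $\phi$ ranges over convex Lipschitz functions bounded from below.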

Complementary slackness yields a barycentric map $T(x) = m_{\pi_x}$ characterized by subgradient conditions (Beiglböck et al., 27 Jan 2025, Cazelles et al., 2021, Guo et al., 26 Nov 2025).
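For a discrete coupling, the barycentric map and the quadratic barycentric cost can be read off directly from the kernels. The sketch below uses hypothetical data and an arbitrary coupling (not claimed to be optimal). It computes $T(x_i) = m_{\pi_{x_i}}$ and the cost $\sum_i \mu_i |x_i - T(x_i)|^2$, then spot-checks with one convex test function that the image law $T_\#\mu$ is dominated by $\nu$, as Jensen's inequality predicts.

```python
import numpy as np

# Hypothetical one-dimensional data and coupling.
x = np.array([0.0, 3.0])
y = np.array([0.0, 1.0, 2.0])
pi = np.array([[1/3, 1/6, 0.0],
               [0.0, 1/6, 1/3]])

mu = pi.sum(axis=1)                 # source marginal
nu = pi.sum(axis=0)                 # target marginal

# Barycentric map: T(x_i) is the mean of the kernel pi_{x_i}.
T = (pi @ y) / mu

# Quadratic barycentric cost of this coupling.
cost = float(np.sum(mu * (x - T) ** 2))

# Spot check of convex order with phi(t) = t^2 (a single test function,
# so this is a necessary condition only).
lhs = float(mu @ T ** 2)            # integral of phi against T_# mu
rhs = float(nu @ y ** 2)            # integral of phi against nu
```

Here `T` moves each source point to the mean of the mass it sends out, so `mu @ T` equals the mean of `nu`. A full convex-order test would have to range over all convex functions.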

Entropic and Martingale OT

For

$$C(x,\rho) = \int c(x,y)\,\rho(dy) + H(\rho \mid \nu),$$

the weak OT problem becomes entropic OT.

If additionally $C(x, \rho) = +\infty$ unless $\int y\,\rho(dy) = x$, one imposes a martingale condition, leading to weak martingale optimal transport (WMOT). Duality and structural results extend to the entropic-martingale context (Beiglböck et al., 27 Jan 2025, Chung et al., 2021, Carlier et al., 20 Nov 2025).
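The reduction to entropic OT rests on the chain rule for relative entropy: integrating $H(\pi_x \mid \nu)$ against $\mu$ gives $H(\pi \mid \mu \otimes \nu)$. The sketch below checks this numerically on hypothetical discrete data and also measures how far the coupling is from satisfying the martingale constraint. The helper names `kl` and `martingale_defect` are illustrative.

```python
import numpy as np

def kl(p, q):
    """Relative entropy H(p | q) for discrete laws, with the convention 0 log 0 = 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def martingale_defect(pi, x, y):
    """Largest gap |x_i - mean(pi_{x_i})|; zero means the coupling is a martingale."""
    mu = pi.sum(axis=1)
    return float(np.max(np.abs(x - (pi @ y) / mu)))

# Hypothetical data; x is chosen so that each kernel is centered at its source point.
x = np.array([0.8, 1.3])
y = np.array([0.0, 1.0, 2.0])
pi = np.array([[0.20, 0.20, 0.10],
               [0.05, 0.25, 0.20]])
mu, nu = pi.sum(axis=1), pi.sum(axis=0)
c = (x[:, None] - y[None, :]) ** 2

# Weak-transport form: sum_i mu_i * [ int c(x_i, y) pi_{x_i}(dy) + H(pi_{x_i} | nu) ].
weak_form = sum(
    mu[i] * (c[i] @ (pi[i] / mu[i]) + kl(pi[i] / mu[i], nu))
    for i in range(len(mu))
)

# Entropic OT form: int c dpi + H(pi | mu x nu).
entropic_form = float(np.sum(c * pi)) + kl(pi.ravel(), np.outer(mu, nu).ravel())

defect = martingale_defect(pi, x, y)
```

The two forms agree up to floating-point error. The defect is numerically zero, so this coupling has finite cost under the martingale version of the cost.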

Hybrid Problems

Mixing barycentric and martingale/entropic constraints or costs yields continuous families interpolating between classical OT, martingale OT, and entropic OT, all captured within the WOT framework (Beiglböck et al., 27 Jan 2025, Guo et al., 26 Nov 2025).

3. Cyclical Monotonicity and Structural Optimality

Optimal weak transport plans are characterized by a form of cyclical monotonicity. A coupling $\pi$ is $C$-monotone if, for every finite family of points $x_1, \dots, x_N$ (taken from a set of full $\mu$-measure) with kernels $p_i = \pi_{x_i}$, and for all competitor measures $q_1, \dots, q_N$ satisfying $\sum_i p_i = \sum_i q_i$:

$$\sum_i C(x_i, p_i) \leq \sum_i C(x_i, q_i).$$

Necessity and sufficiency of this condition (under extra regularity such as $C$ being Lipschitz in the measure variable) provide a direct generalization of classical cyclical monotonicity (Veraguas et al., 2018, Backhoff-Veraguas et al., 2019). This also underpins the stability theory: optimal plans are stable under perturbations of marginals or cost, given the adapted topology (which metrizes joint weak convergence of marginals and conditional laws) (Backhoff-Veraguas et al., 2019, Beiglböck et al., 2021).
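The monotonicity inequality can be tested against any single competitor family. The sketch below does this for the quadratic barycentric cost on hypothetical data. A nonnegative gap for one competitor is only a necessary check: $C$-monotonicity requires it for every admissible competitor. The function name `monotonicity_gap` is illustrative.

```python
import numpy as np

def monotonicity_gap(xs, ps, qs, cost):
    """Return sum_i C(x_i, q_i) - sum_i C(x_i, p_i) for one competitor family.

    ps, qs : lists of probability vectors on a common y-grid with equal sums.
    """
    ps, qs = np.asarray(ps, dtype=float), np.asarray(qs, dtype=float)
    if not np.allclose(ps.sum(axis=0), qs.sum(axis=0)):
        raise ValueError("competitors must redistribute the same total mass")
    current = sum(cost(x, p) for x, p in zip(xs, ps))
    competitor = sum(cost(x, q) for x, q in zip(xs, qs))
    return competitor - current

# Hypothetical target grid and barycentric quadratic cost.
y = np.array([-2.0, 0.0, 2.0])
bary_cost = lambda x, rho: float((x - y @ rho) ** 2)

xs = [-1.0, 1.0]
ps = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]        # kernels centered at -1 and +1
qs = [[0.25, 0.5, 0.25], [0.25, 0.5, 0.25]]    # same total mass, both centered at 0

gap = monotonicity_gap(xs, ps, qs, bary_cost)
```

In this example the current kernels have zero cost and the competitor costs 2, so the gap is positive, as expected for a coupling that already satisfies the martingale condition.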

4. Dynamic, Martingale, and PDE Connections

WOT admits a dynamic (PDE) characterization generalizing the Benamou–Brenier formula:

  • The static weak transport problem is equivalent to a dynamic minimization over curves $(\varrho_t, \lambda_t)$ solving a generalized Fokker–Planck equation with a (possibly measure-valued) diffusion tensor and an integrated convex cost (Bulanyi, 2023).
  • Barycentric WOT can be described dynamically using drift–diffusion SDEs, with cost determined by the drift term, and further extended to martingale settings where the drift vanishes and the cost penalizes only covariance (Guo et al., 26 Nov 2025).

This establishes equivalence between static (coupling) and dynamic (PDE/SDE) perspectives for broad classes of convex costs.
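For reference, the classical Benamou–Brenier formula that these dynamic formulations generalize can be written as follows on $\mathbb{R}^d$, with $v_t$ denoting a velocity field (notation not used elsewhere in this overview). The generalized Fokker–Planck versions in the cited works add a diffusion term and are not reproduced here.

```latex
W_2^2(\mu,\nu)
  = \inf_{(\varrho_t, v_t)}
    \left\{ \int_0^1 \!\! \int_{\mathbb{R}^d} |v_t(x)|^2 \,\varrho_t(dx)\,dt
    \;:\; \partial_t \varrho_t + \nabla \cdot (\varrho_t v_t) = 0,\;
    \varrho_0 = \mu,\; \varrho_1 = \nu \right\}
```

The constraint is the continuity equation, which forces the curve $(\varrho_t)$ to move mass from $\mu$ to $\nu$ along the velocity field.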

5. Computational Methods and Algorithms

Efficient computation of WOT is challenging due to nonlinearity and complexity of the transport constraints.

  • Mirror descent methods: For barycentric and unnormalized-kernel variants (WOTUK), primal and dual variants of mirror descent with entropy mirrors (KL divergence) and Sinkhorn projection are provably convergent and scalable (Paty et al., 2022).
  • Neural approaches: Neural parameterizations of stochastic transport maps can approximate any WOT plan and can be optimized via a max–min (saddle-point) objective; this framework accommodates high-dimensional, nonlinear, and stochastic transport settings (Korotin et al., 2022).
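To illustrate the mirror descent idea on the quadratic barycentric cost, the sketch below alternates a multiplicative (entropic mirror) gradient step with a KL projection onto the couplings, computed by Sinkhorn scaling. It is a generic textbook-style scheme on a toy one-dimensional instance and is not the algorithm of the cited papers. The step size, iteration counts, and function names are assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_projection(K, mu, nu, n_iter=200):
    """KL projection of a positive matrix K onto couplings of (mu, nu) by alternating scaling."""
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (K @ v)            # match row sums
        v = nu / (K.T @ u)          # match column sums
    return u[:, None] * K * v[None, :]

def barycentric_objective(pi, x, y, mu):
    """sum_i mu_i * |x_i - mean(pi_{x_i})|^2 for one-dimensional grids."""
    b = (pi @ y) / mu
    return float(np.sum(mu * (x - b) ** 2))

def mirror_descent_wot(x, y, mu, nu, step=0.1, n_steps=200):
    """Entropic mirror descent on the coupling, projected back by Sinkhorn each step."""
    pi = np.outer(mu, nu)           # start from the independent coupling
    for _ in range(n_steps):
        b = (pi @ y) / mu
        # Gradient of the objective in pi[i, j], with row masses held at mu_i.
        grad = 2.0 * (b - x)[:, None] * y[None, :]
        pi = sinkhorn_projection(pi * np.exp(-step * grad), mu, nu)
    return pi

# Toy instance (hypothetical): mu is dominated by nu in convex order.
x = np.array([-1.0, 1.0]);      mu = np.array([0.5, 0.5])
y = np.array([-2.0, 0.0, 2.0]); nu = np.array([0.25, 0.5, 0.25])

start_value = barycentric_objective(np.outer(mu, nu), x, y, mu)
pi_md = mirror_descent_wot(x, y, mu, nu)
final_value = barycentric_objective(pi_md, x, y, mu)
```

On this instance a martingale coupling exists, so the optimal value is 0. The iterates move the objective from 1 (the independent coupling) toward 0 while keeping both marginals fixed. The optimum lies on the boundary of the simplex, so convergence of the multiplicative update is slow in the final digits.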

These algorithms have been validated in economics (matching models), machine learning (distributional alignment, barycenters), and vision (image translation).

6. Weak Barycenters and Generalizations

Weak barycenters, defined via minimization of sums of WOT costs over a family of laws, generalize Wasserstein barycenters. Characterization and computation exploit the structure of convex ordering:

  • Existence: Tightness and lower semicontinuity arguments guarantee minimizers under moment conditions (Cazelles et al., 2021).
  • Characterization: Weak barycenters extract common geometric/latent information and have robustness advantages compared to classical barycenters.
  • Algorithms: Deterministic (fixed-point), stochastic (streaming), and optimization (proximal gradient) methods are available.

Open problems concern uniqueness (especially in higher dimensions), stability, and geometric properties.

7. Applications and Extensions

  • Economics: WOT captures nonlinear aggregation in matching models, labor assignment, and production economics, providing structural insights and richer matching patterns than OT (Paty et al., 2022).
  • Finance: WMOT models are fundamental for robust pricing under martingale constraints, with applications including the robust superhedging of options and VIX futures, and with proven stability under distributional uncertainty (Beiglböck et al., 2021).
  • Information Theory: Rate-distortion functions, Shannon bounds, and connections to the Schrödinger bridge are realized within the WOT setting (Zou et al., 16 Jan 2025).
  • Risk Measures: Convex risk measures with WOT penalties yield primal and dual representations, with computational schemes based on variational and neural optimization (Kupper et al., 2023).
  • Metric Geometry and Analysis: Extensions to barycentric costs, entropic regularizations, and transport with moment constraints expand the landscape of metric and probabilistic geometry (Carlier et al., 20 Nov 2025, Chung et al., 2021, Chung et al., 2019).

Table: Core Weak OT Paradigms

| Cost Formulation | Characteristic Constraint | Classical Example |
|---|---|---|
| $C(x,\rho) = \int c(x,y)\,d\rho$ | Linear (classical OT) | Wasserstein distance |
| $C(x,\rho) = h(x-\int y\,d\rho)$ | Barycentric (convex order) | Brenier–Strassen |
| $C(x,\rho) = \cdots + H(\rho \mid \nu)$ | Entropic/Schrödinger regularization | Entropic OT |
| Martingale constraint ($E[Y \mid X]=X$) | Mean-preserving (martingale OT) | Martingale couplings |

This unified convex-analytic perspective recovers and extends key foundational results of optimal transport (duality, structure of optimizers, and characterization via potential functions or subgradients) and admits flexible hybridizations supporting applications in analysis, data science, and economics (Beiglböck et al., 27 Jan 2025, Guo et al., 26 Nov 2025, Choné et al., 2022).
