Martingale Optimal Transport

Updated 10 October 2025

Martingale Optimal Transport is an extension of classical optimal transport that enforces a mean-preserving (martingale) constraint, with applications in model-independent pricing and robust hedging.
Its variational characterization generalizes c-cyclical monotonicity, and a complete duality theory characterizes optimal couplings between marginals in convex order.
The theory underpins numerical schemes such as entropy regularization and linear programming methods, which are crucial for solving complex multi-marginal and pathwise formulations.

Martingale Optimal Transport (MOT) is a generalization of the classical optimal transport problem where the couplings between prescribed marginals are required to respect a martingale constraint. This additional structure is motivated both by mathematical finance—where the no-arbitrage condition manifests as a martingale constraint for asset prices under risk-neutral measures—and by deep connections to model-independent pricing, robust hedging, stochastic control, and geometry. Since its modern inception (Beiglböck et al., 2012), MOT has developed a comprehensive theory including variational characterizations, complete duality for broad classes of cost functions, geometric structure, computational schemes, and numerous extensions to multi-marginal, vector-valued (multi-asset), and path space formulations.

1. Foundations and Variational Characterization

The MOT problem takes prescribed marginals $\mu$ and $\nu$ on $\mathbb{R}$ (or $\mathbb{R}^d$) in convex order ($\mu \leq_{cx} \nu$) and seeks a coupling $\pi$ of them such that, for $\mu$-almost every $x$, the conditional law $\pi_x$ (the disintegration of $\pi$ given $X=x$) is mean-preserving: $\int y\,d\pi_x(y)=x$. The canonical MOT problem is

$\inf_{\pi \in \Pi_M(\mu,\nu)} \int c(x,y)\,d\pi(x,y),$

where $\Pi_M(\mu,\nu)$ is the set of probability measures on $\mathbb{R}^2$ with marginals $\mu,\nu$ and the martingale constraint above.
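By Strassen's theorem, $\Pi_M(\mu,\nu)$ is nonempty exactly when $\mu \leq_{cx} \nu$. On the real line, convex order between measures with equal mass and mean can be tested by comparing call-type integrals $\int (x-k)^+\,d\mu \leq \int (y-k)^+\,d\nu$ over all strikes $k$. The following is a minimal sketch of that test for finitely supported measures; the function names and the example marginals are illustrative assumptions, not taken from the cited papers.

```python
from fractions import Fraction as F


def call_price(measure, k):
    """Integral of (x - k)^+ against a discrete measure {atom: weight}."""
    return sum(w * max(x - k, 0) for x, w in measure.items())


def mean(measure):
    return sum(x * w for x, w in measure.items())


def in_convex_order(mu, nu):
    """Check mu <=_cx nu for finitely supported measures on the real line."""
    if sum(mu.values()) != sum(nu.values()) or mean(mu) != mean(nu):
        return False
    # The difference of the two call-price functions is piecewise linear with
    # kinks only at atoms and vanishes outside the support, so testing the
    # strikes located at the atoms is enough.
    strikes = set(mu) | set(nu)
    return all(call_price(mu, k) <= call_price(nu, k) for k in strikes)


# Hypothetical example marginals (exact arithmetic via Fraction).
mu = {-1: F(1, 2), 1: F(1, 2)}
nu = {-2: F(1, 4), 0: F(1, 2), 2: F(1, 4)}

print(in_convex_order(mu, nu))  # True
print(in_convex_order(nu, mu))  # False
```

Exact rational weights avoid spurious failures of the equal-mean check that floating-point marginals could cause; with float data a tolerance would be needed.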

A foundational result is the variational (or local optimality) principle: a martingale coupling $\pi$ is optimal if and only if there exists a Borel set $\Gamma$ of full $\pi$-measure such that for any finitely supported measure $\alpha$ concentrated on $\Gamma$ and any competitor $\alpha'$ (a measure with the same $x$- and $y$-marginals as $\alpha$ and the same conditional means, $\int y\,d\alpha'_x(y) = \int y\,d\alpha_x(y)$), one has

$\int c\,d\alpha \leq \int c\,d\alpha'.$

This property generalizes $c$ -cyclical monotonicity from classical transport and characterizes optimality via local barycenter-preserving rearrangements (Beiglböck et al., 2012).

The martingale constraint itself can be succinctly expressed as

$\int \rho(x) (y-x)\,d\pi(x,y) = 0 \quad\text{for all bounded measurable } \rho,$

or equivalently, $\int y\, d\pi_x(y) = x$ for $\mu$ -a.e. $x$ .
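For a finitely supported coupling, both formulations reduce to a per-atom check: taking $\rho$ to be the indicator of a single atom $x$ gives $\sum_y \pi(x,y)(y-x)=0$. The sketch below verifies this together with the marginals; the dictionary representation and the example coupling are assumptions made for illustration.

```python
from fractions import Fraction as F


def marginals(pi):
    """Return the x- and y-marginals of a discrete coupling {(x, y): weight}."""
    mu, nu = {}, {}
    for (x, y), w in pi.items():
        mu[x] = mu.get(x, 0) + w
        nu[y] = nu.get(y, 0) + w
    return mu, nu


def is_martingale(pi):
    """Check sum_y pi(x, y) * (y - x) == 0 for every atom x."""
    drift = {}
    for (x, y), w in pi.items():
        drift[x] = drift.get(x, 0) + w * (y - x)
    return all(d == 0 for d in drift.values())


q = F(1, 4)
# Hypothetical coupling: each starting point moves up or down by one.
pi = {(-1, -2): q, (-1, 0): q, (1, 0): q, (1, 2): q}
mu, nu = marginals(pi)

# The independent coupling has the same marginals but is not a martingale.
independent = {(x, y): mu[x] * nu[y] for x in mu for y in nu}

print(is_martingale(pi))           # True
print(is_martingale(independent))  # False
```

The independent coupling fails because its conditional mean at every $x$ equals the mean of $\nu$ rather than $x$.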

2. Monotone Martingale Coupling, Geometry, and Uniqueness

A key structural result in dimension one is the existence and uniqueness of a "monotone martingale coupling," also called the left-curtain coupling, which is optimal when $c(x,y)=h(y-x)$ with $h'$ strictly convex (Beiglböck et al., 2012). When $\mu$ is continuous, this plan $\pi$ is concentrated on the union of the graphs of two measurable functions $T_1, T_2$ satisfying $T_1(x)\leq x \leq T_2(x)$ and the monotonicity conditions $x < x' \implies T_2(x) < T_2(x') \text{ and } T_1(x') \notin (T_1(x), T_2(x)).$ Thus the optimal plan sends the mass at $\mu$-almost every $x$ to at most two points, which gives it an explicit canonical structure. Some splitting is forced by the martingale constraint, since a kernel $\pi_x$ that is a single Dirac mass must sit at $x$ itself. This stands in contrast to classical transport, where the optimal plan is typically induced by a single (often monotone) map.

The left-monotonicity property of the left-curtain coupling plays the role that $c$-cyclical monotonicity plays in classical transport: there is a set $\Gamma$ carrying the coupling such that, for points $(x, y^-), (x, y^+), (x', y')$ in $\Gamma$, it is forbidden that $x < x'$ and $y^- < y' < y^+$. This determines the geometric "leftmost" nature of the optimal transport.
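On a finite support the forbidden configuration can be searched for directly. The following sketch is a brute-force check, cubic in the number of support points; the function name and the two example supports are hypothetical and serve only to illustrate the condition.

```python
def violates_left_monotonicity(support):
    """Return a forbidden triple ((x, y-), (x, y+), (x', y')) or None.

    A triple is forbidden when x < x' and y- < y' < y+.
    """
    pts = sorted(set(support))
    for (x, y_minus) in pts:
        for (x_same, y_plus) in pts:
            if x_same != x or not y_minus < y_plus:
                continue
            for (x_prime, y_prime) in pts:
                if x < x_prime and y_minus < y_prime < y_plus:
                    return (x, y_minus), (x, y_plus), (x_prime, y_prime)
    return None


# Support of a coupling in which each x moves up or down by one: no violation.
good = [(-1, -2), (-1, 0), (1, 0), (1, 2)]
# Here x = 0 splits to -2 and 2 while x' = 1 lands strictly in between.
bad = [(0, -2), (0, 2), (1, 1)]

print(violates_left_monotonicity(good))  # None
print(violates_left_monotonicity(bad))   # ((0, -2), (0, 2), (1, 1))
```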

3. Duality Theory and Monotonicity Principles

Duality for MOT differs from the classical Kantorovich dual because of the linear martingale constraint. The correct dual domain consists of triples $(\varphi, \psi, h)$ of measurable functions with

$\varphi(x) + \psi(y) + h(x)(y-x) \geq f(x,y) \qquad \text{for } \Pi_M(\mu,\nu)\text{-q.s. } (x,y),$

where "q.s." denotes quasi-surely, i.e., outside a polar set for martingale measures (Beiglböck et al., 2015). The dual value is

$I_{\mu,\nu}(f) = \inf_{(\varphi, \psi, h)} \mu(\varphi)+\nu(\psi).$

A complete duality holds: for any Borel measurable reward $f$ with values in $[0,\infty]$,

$\sup_{\pi \in \Pi_M(\mu,\nu)} \int f\,d\pi = I_{\mu,\nu}(f),$

with existence of a dual optimizer when the value is finite. The contact set

$\Gamma = \Big\{(x,y) : \varphi(x) + \psi(y) + h(x)(y-x) = f(x,y) \Big\}$

characterizes the support of optimal plans, providing a monotonicity principle: a plan is optimal iff it is supported on $\Gamma$ .
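The easy half of this duality (weak duality) follows in one line. As a sketch, assume that $\varphi$, $\psi$, and $h(x)(y-x)$ are integrable in the ordinary sense; the full theory works with a generalized notion of integral. For any martingale coupling $\pi$ of $\mu$ and $\nu$ and any admissible triple $(\varphi,\psi,h)$,

$\int f\,d\pi \;\leq\; \int \big[\varphi(x)+\psi(y)+h(x)(y-x)\big]\,d\pi(x,y) \;=\; \mu(\varphi)+\nu(\psi)+\int h(x)\Big(\int (y-x)\,d\pi_x(y)\Big)\,d\mu(x) \;=\; \mu(\varphi)+\nu(\psi),$

where the last step uses $\int y\,d\pi_x(y)=x$ for $\mu$-a.e. $x$. Equality holds precisely when the dual inequality is an equality $\pi$-almost surely, that is, when $\pi$ is concentrated on the contact set $\Gamma$.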

In higher dimensions, quasi-sure duality requires notable modifications. Dual constraints and the notion of "tangent convex functions" (e.g., $T_p f(x, y) = f(y) - f(x) - p(x)\cdot(y - x)$ for $p \in \partial f$) play a central role, and polar sets may arise from convex geometric barriers. Strong duality holds for upper semianalytic costs under these relaxed (quasi-sure) formulations (De March, 2018). Pointwise duality is generally unavailable in dimensions $d > 1$.

4. Extensions: Multi-Marginal, Vectorial, and Pathwise MOT

The MOT paradigm admits generalizations to multi-marginal and vectorial settings. In multi-period MOT, measures $\mu_1, \ldots, \mu_n$ in convex order are coupled by laws of $(X_1, \ldots, X_n)$ with $X_k \sim \mu_k$ under which the process is a martingale, i.e., $E[X_{k+1} \mid X_1, \ldots, X_k] = X_k$ for each $k < n$ (Pass et al., 5 Jun 2025). Gluing lemmas characterize how pairwise optimal transports combine for multi-period martingales, especially under additive or shared-initial cost functions. In particular, with additively separable cost, gluing optimal two-period couplings yields a multi-period optimizer.
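One concrete reading of gluing is the Markov composition of two one-period martingale couplings through their shared middle marginal. The sketch below builds that three-period law for discrete couplings and checks the martingale property; it illustrates the construction only, and whether the glued plan is optimal depends on the cost structure as stated above. Function names and data are hypothetical.

```python
from fractions import Fraction as F


def glue(pi12, pi23):
    """Markov gluing: pi(x1, x2, x3) = pi12(x1, x2) * pi23(x2, x3) / mu2(x2).

    Assumes the second marginal of pi12 equals the first marginal of pi23.
    """
    mu2 = {}
    for (x2, _), w in pi23.items():
        mu2[x2] = mu2.get(x2, 0) + w
    return {
        (x1, x2, x3): w12 * w23 / mu2[x2]
        for (x1, x2), w12 in pi12.items()
        for (y2, x3), w23 in pi23.items()
        if y2 == x2
    }


def is_martingale_path(pi):
    """Check E[X2 | X1] = X1 and E[X3 | X1, X2] = X2 atom by atom."""
    drift2, drift3 = {}, {}
    for (x1, x2, x3), w in pi.items():
        drift2[x1] = drift2.get(x1, 0) + w * (x2 - x1)
        drift3[(x1, x2)] = drift3.get((x1, x2), 0) + w * (x3 - x2)
    return all(d == 0 for d in drift2.values()) and all(
        d == 0 for d in drift3.values()
    )


pi12 = {(0, -1): F(1, 2), (0, 1): F(1, 2)}
pi23 = {(-1, -2): F(1, 4), (-1, 0): F(1, 4), (1, 0): F(1, 4), (1, 2): F(1, 4)}
glued = glue(pi12, pi23)

print(sum(glued.values()))        # 1
print(is_martingale_path(glued))  # True
```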

Vectorial MOT (VMOT) further extends to couplings between random vectors $X$ and $Y$ with given marginals, subject to the vector-valued martingale constraint $E[Y|X]=X$ (Lim, 2016). The dual problem involves potentials $f_i, g_i$ and strategies $h_i$ for each coordinate, enforcing the marginal and martingale constraints respectively. For strictly convex distance-type costs, the conditional law $\pi_x$ of an optimal $\pi$ is supported on the extreme points of the convex hull of its support, an extremal correlation property.

MOT has been formulated for continuous-time path spaces (Skorokhod space). The duality between robust superhedging and the supremum over all martingale measures consistent with the marginals is established in terms of pathwise stochastic integrals (Dolinsky et al., 2014, Cheridito et al., 2019). Here, the robust price for a path-dependent claim $G$ is

$\sup_{Q \in \mathcal{M}} E_Q[G(S)],$

where $S$ is the canonical process and $\mathcal{M}$ is the set of martingale measures consistent with all available market information.

5. Applications in Robust Hedging and Finance

MOT forms the mathematical foundation for robust superhedging and model-independent option pricing. In the robust pricing problem for a path-dependent European option, the super-replication price equals the supremum of expectations over all martingale measures consistent with the observed vanilla options (Dolinsky et al., 2014). The duality framework enables construction of semi-static portfolios (static in vanilla options, dynamic in the underlying) that superreplicate any contingent claim pathwise, even under model uncertainty.
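A simple case where the semi-static portfolio replicates exactly is the squared increment $f(x,y)=(y-x)^2$, since $(y-x)^2 = -x^2 + y^2 - 2x(y-x)$. The sketch below checks this identity on a grid and computes the resulting model-independent price $\mu(\varphi)+\nu(\psi)$. The marginals, the grid, and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction as F

# Hypothetical marginals implied by vanilla prices at two maturities.
mu = {-1: F(1, 2), 1: F(1, 2)}
nu = {-2: F(1, 4), 0: F(1, 2), 2: F(1, 4)}


def phi(x):
    """Static payoff written on the first-maturity price."""
    return -x * x


def psi(y):
    """Static payoff written on the second-maturity price."""
    return y * y


def h(x):
    """Units of the underlying held between the two dates."""
    return -2 * x


def payoff(x, y):
    return (y - x) ** 2


def hedge_value(x, y):
    return phi(x) + psi(y) + h(x) * (y - x)


# Smallest hedging surplus over a grid of price scenarios (0 means replication).
grid = range(-5, 6)
gap = min(hedge_value(x, y) - payoff(x, y) for x in grid for y in grid)

# Cost of the static part; the dynamic part costs nothing under any martingale law.
price = sum(w * phi(x) for x, w in mu.items()) + sum(w * psi(y) for y, w in nu.items())

print(gap)    # 0
print(price)  # 1
```

Because the hedge matches the payoff in every scenario, every martingale measure with these marginals assigns the claim the same price, so the upper and lower robust bounds coincide in this example.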

In practice, the left-curtain coupling operationalizes the "extremal" martingale that yields robust price bounds. For multi-asset (vectorial) derivatives, VMOT yields worst-case price bounds when the joint law of assets is only partially observed. The geometry of optimizers offers insight into risk, as the maximal (or minimal) price is attained at couplings with extremal conditional correlation structure. The extension to frictions incorporates state-dependent transaction costs and yields trade bands or no-transaction regions where the identity map is optimal (Rai, 9 Oct 2025). As friction vanishes, the structure converges to the frictionless left-curtain coupling.

Distributionally robust variants relax the exact marginal constraints, requiring only proximity in the Wasserstein metric; stability results quantify the convergence of the robust superhedging price as empirical approximations of marginals improve (Zhou et al., 2021, Backhoff-Veraguas et al., 2019, Wiesel, 2019). Wasserstein-induced penalization is used in entropy-regularized MOT, controlling deviations from prescribed marginals and bridging to the exact MOT in the zero-penalization limit (Doldi et al., 2020).

6. Computational Approaches and Numerical Stability

Finite-dimensional MOT problems can be recast as large-scale linear programs upon discretization of the marginals and state space (Guo et al., 2017). With atoms $x_i$ of $\mu$ and $y_j$ of $\nu$, the martingale constraint becomes one linear equality per row of the coupling matrix: $\sum_j \pi_{ij}(y_j - x_i) = 0$. To improve numerical tractability, regularized variants add an entropic term, yielding strictly convex objectives suitable for iterative algorithms such as Bregman projections or Sinkhorn-like schemes (Tang et al., 25 Aug 2025). Enforcing the martingale constraint only up to a controllable tolerance circumvents infeasibility due to data or discretization error. Rate-of-convergence results and stability under partial marginal information underpin these computational schemes. In high dimensions, McCormick relaxations modulate bicausal structure, translating nonconvex constraints into tractable linear programs for robust pricing (Bayraktar et al., 2024).
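To make the linear-programming formulation concrete, the sketch below assembles the equality constraints (row sums, column sums, and one martingale equation per row) for a flattened coupling matrix and checks them against a known martingale coupling. It is an illustrative assembly in pure Python, not the scheme of the cited papers; `build_mot_lp` and `residual` are hypothetical helper names.

```python
from fractions import Fraction as F


def build_mot_lp(xs, mu, ys, nu, cost):
    """Assemble (c, A_eq, b_eq) for the discretized MOT linear program.

    The unknown is the row-major flattened coupling p[i * len(ys) + j] >= 0.
    """
    m, n = len(xs), len(ys)
    c = [cost(x, y) for x in xs for y in ys]
    A_eq, b_eq = [], []
    for i in range(m):  # row sums reproduce mu
        A_eq.append([1 if k // n == i else 0 for k in range(m * n)])
        b_eq.append(mu[i])
    for j in range(n):  # column sums reproduce nu
        A_eq.append([1 if k % n == j else 0 for k in range(m * n)])
        b_eq.append(nu[j])
    for i in range(m):  # martingale: sum_j p_ij * (y_j - x_i) = 0
        A_eq.append(
            [ys[k % n] - xs[i] if k // n == i else 0 for k in range(m * n)]
        )
        b_eq.append(0)
    return c, A_eq, b_eq


def residual(A_eq, b_eq, p):
    """Largest absolute violation of A_eq @ p = b_eq."""
    return max(
        abs(sum(a * v for a, v in zip(row, p)) - b) for row, b in zip(A_eq, b_eq)
    )


xs, mu = [-1, 1], [F(1, 2), F(1, 2)]
ys, nu = [-2, 0, 2], [F(1, 4), F(1, 2), F(1, 4)]
c, A_eq, b_eq = build_mot_lp(xs, mu, ys, nu, lambda x, y: abs(y - x))

# A martingale coupling of mu and nu, flattened row by row.
p = [F(1, 4), F(1, 4), 0, 0, F(1, 4), F(1, 4)]
print(residual(A_eq, b_eq, p))  # 0
```

After conversion to floats, these arrays are in the form accepted by a generic LP solver, for example `scipy.optimize.linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))`. Relaxing the martingale rows to two-sided inequalities with a small tolerance gives the approximate enforcement described above.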

7. Further Extensions and Theoretical Developments

MOT has inspired a wealth of theoretical investigations:

- Nonlinear MOT, where cost functionals depend nonlinearly (often via a concave function) on conditional expectations, motivates new geometric structures called curtain transports and two-step reduction arguments (Cox et al., 2019).
- Dynamic and geometric formulations, including Benamou–Brenier-type action minimization for continuous-time martingales, link optimal interpolations to Fokker–Planck PDEs and porous medium equations (Huesmann et al., 2017).
- Stability, approximation, and regularity theory covers quasi-sure duality in multi-dimensional settings, corrections for polar sets, and continuity of the MOT functional as a map from marginals to prices (De March, 2018, Wiesel, 2019).
- Frictional MOT incorporates trading frictions (e.g., spread and impact effects), yielding biatomic disintegration on active regions and characterizing the inaction (trade band) zone by dual subdifferential conditions. Vanishing friction recovers the left-curtain coupling (Rai, 9 Oct 2025).
- Information-based MOT interprets the coupling problem through stochastic filtering of arcade martingales, structurally modeling the "noise-injection" in martingale constraints and providing new dynamical connections (Kassis et al., 2024).

These developments unify themes from optimal transport, stochastic control, PDE theory, and financial engineering, leading to a robust framework for model-independent risk analysis, hedging, and price discovery in markets with limited information or frictional effects.