Kantorovich Duality: Theory & Applications
- Kantorovich duality is a fundamental principle equating the minimal transport cost with the maximal value over potential functions under cost constraints.
- It extends to multimarginal, quantum, and abstract settings, offering versatile tools for analysis in statistics, physics, and machine learning.
- The dual formulation underpins efficient numerical methods and stability analysis through properties like c-conjugacy and cyclic monotonicity.
Kantorovich duality is the central principle equating the minimal cost in an optimal transport problem with the maximal value of a dual problem formulated in terms of certain potential functions. In its original and most general forms, this duality provides a convex-analytic bridge between primal minimization of transport costs over couplings and dual maximization over potential families constrained by the cost function. The classical two-marginal formulation has extensive generalizations: to multi-marginal transport, non-additive set functionals (capacities), quantum states, abstract measure-theoretic settings, and beyond. Modern developments, particularly for multimarginal problems, have established canonical forms for optimal dual potentials and their structural connection to the geometry of optimal transport plans, providing foundational tools for analysis in probability, statistics, physics, and machine learning.
1. Classical Kantorovich Duality
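The optimal transport problem seeks to minimize the total cost of transporting mass from one distribution to another. Given probability measures $\mu$ on $X$ and $\nu$ on $Y$ and a measurable cost $c(x,y)$ on $X \times Y$, transport plans are measures $\pi$ on $X \times Y$ with marginals $\mu$, $\nu$; the set of such plans is written $\Pi(\mu,\nu)$. The primal problem is
$$\inf_{\pi \in \Pi(\mu,\nu)} \int_{X \times Y} c(x,y)\,d\pi(x,y).$$
Kantorovich duality (Gover, 23 Jan 2025, Moameni, 2014, Gozlan et al., 2014) states that, under mild regularity (e.g., $c$ lower semicontinuous and finite), one always has
$$\inf_{\pi \in \Pi(\mu,\nu)} \int_{X \times Y} c\,d\pi = \sup_{\varphi(x) + \psi(y) \le c(x,y)} \left\{ \int_X \varphi\,d\mu + \int_Y \psi\,d\nu \right\},$$
where the supremum is over suitable (measurable or continuous) potentials. This duality is exact (no gap), and optimal potentials $\varphi$, $\psi$ and an optimal plan $\pi$ can often be attained.
Complementary slackness holds: for any optimal pair, $\varphi(x) + \psi(y) = c(x,y)$ on the support of any optimal plan. The optimal potentials form a so-called $c$-conjugate pair, and the support of an optimal plan is contained in the contact set $\{(x,y) : \varphi(x) + \psi(y) = c(x,y)\}$. This duality underlies all modern analytical and algorithmic approaches to optimal transport.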
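On finite supports both problems are linear programs, so the absence of a duality gap and complementary slackness can be checked numerically. The following is a minimal sketch, not a reference implementation: the toy measures, the quadratic cost, and all variable names are assumptions chosen for illustration, and it relies on `scipy.optimize.linprog` with the HiGHS solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy data: two discrete measures on the real line.
x = np.array([0.0, 1.0, 3.0])          # support of mu
y = np.array([0.5, 2.0])               # support of nu
mu = np.array([0.2, 0.5, 0.3])         # weights of mu (sum to 1)
nu = np.array([0.6, 0.4])              # weights of nu (sum to 1)
C = (x[:, None] - y[None, :]) ** 2     # assumed cost c(x_i, y_j) = |x_i - y_j|^2
m, n = C.shape

# Primal: minimize <C, pi> over pi >= 0 with row sums mu and column sums nu.
A_eq = np.zeros((m + n, m * n))
for i in range(m):
    A_eq[i, i * n:(i + 1) * n] = 1.0   # sum_j pi[i, j] = mu[i]
for j in range(n):
    A_eq[m + j, j::n] = 1.0            # sum_i pi[i, j] = nu[j]
b_eq = np.concatenate([mu, nu])
primal = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
pi = primal.x.reshape(m, n)

# Dual: maximize <phi, mu> + <psi, nu> subject to phi_i + psi_j <= C[i, j].
# linprog minimizes, so the objective is negated; the potentials are free variables.
dual = linprog(-b_eq, A_ub=A_eq.T, b_ub=C.ravel(), bounds=(None, None), method="highs")
phi, psi = dual.x[:m], dual.x[m:]

primal_value = primal.fun
dual_value = -dual.fun
slack = C - (phi[:, None] + psi[None, :])   # nonnegative by dual feasibility

print("primal:", primal_value, "dual:", dual_value)
print("largest slack on the support of pi:", slack[pi > 1e-9].max())
```

The two printed values should agree up to solver tolerance, and the slack should vanish wherever the plan puts mass, which is the discrete form of the contact-set statement.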
2. Multimarginal Kantorovich Duality
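The theory extends to $N$ marginals $\mu_1, \dots, \mu_N$ on spaces $X_1, \dots, X_N$, transporting mass across $X_1 \times \cdots \times X_N$ with cost $c(x_1, \dots, x_N)$ (Cheryala et al., 23 Jan 2026, Moameni, 2014). The primal problem is
$$\inf_{\pi \in \Pi(\mu_1, \dots, \mu_N)} \int_{X_1 \times \cdots \times X_N} c(x_1, \dots, x_N)\,d\pi(x_1, \dots, x_N).$$
The dual maximizes the sum of integrals of potentials $\varphi_1, \dots, \varphi_N$, each $\varphi_i$ a function on $X_i$, under the constraint $\sum_{i=1}^{N} \varphi_i(x_i) \le c(x_1, \dots, x_N)$ for every $(x_1, \dots, x_N)$. That is,
$$\sup \left\{ \sum_{i=1}^{N} \int_{X_i} \varphi_i\,d\mu_i \;:\; \sum_{i=1}^{N} \varphi_i(x_i) \le c(x_1, \dots, x_N) \right\}.$$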
Key structural results include:
- Canonical c-conjugate potentials: For each $i$, define the $c$-conjugate
$$\varphi_i^{c}(x_i) = \inf_{x_j,\ j \ne i} \Big\{ c(x_1, \dots, x_N) - \sum_{j \ne i} \varphi_j(x_j) \Big\}.$$
There always exists an optimal family where all $\varphi_i$ are fully $c$-conjugate, i.e. $\varphi_i = \varphi_i^{c}$ for every $i$.
- c-cyclical monotonicity and c-splitting sets: The support of any optimal plan is always $c$-cyclically monotone (a permutation-invariant generalization of cyclical monotonicity for $N = 2$), and is contained in the $c$-splitting set for the optimal potentials, i.e. the set where $\sum_i \varphi_i(x_i) = c(x_1, \dots, x_N)$ (Cheryala et al., 23 Jan 2026, Moameni, 2014).
- Dual attainment and stability: In both compact and non-compact settings (Polish spaces, bounded continuous $c$), the supremum in the dual is attained via tightness, equicontinuity, and truncation procedures. The dual value is preserved under restriction to growing compacts, and dual families can be regularized to be uniformly bounded.
This structure supports development of numerical schemes and underpins stability, sensitivity, and differentiability analysis of the multimarginal transport cost (Cheryala et al., 23 Jan 2026, Moameni, 2014).
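The role of $c$-conjugation can be illustrated on a finite problem. The sketch below assumes $N = 3$ marginals on small finite sets with a random cost tensor (all data and function names are hypothetical). It starts from a crude feasible family of potentials and replaces each potential in turn by its $c$-conjugate. Each replacement keeps the family feasible and cannot decrease the dual objective. This is only an improvement step, not a solver for the optimal dual.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: N = 3 marginals on small finite sets.
sizes = (4, 3, 5)
C = rng.random(sizes)                                 # cost tensor c(x1, x2, x3)
mus = [rng.dirichlet(np.ones(k)) for k in sizes]      # marginal weights

def potential_sum(phis):
    """Tensor with entries phi_1(x1) + phi_2(x2) + phi_3(x3)."""
    p1, p2, p3 = phis
    return p1[:, None, None] + p2[None, :, None] + p3[None, None, :]

def c_conjugate(phis, i):
    """c-conjugate of the i-th potential.

    inf over the other coordinates of (c - sum of the other potentials)
    equals phi_i + min over the other axes of (c - sum of all potentials).
    """
    gap = C - potential_sum(phis)
    other_axes = tuple(k for k in range(C.ndim) if k != i)
    return phis[i] + gap.min(axis=other_axes)

def dual_value(phis):
    return sum(float(p @ mu) for p, mu in zip(phis, mus))

# Crude feasible start: the constant min(c) on the first factor, zero elsewhere.
phis = [np.full(sizes[0], C.min()), np.zeros(sizes[1]), np.zeros(sizes[2])]
values = [dual_value(phis)]

for sweep in range(2):
    for i in range(3):
        phis[i] = c_conjugate(phis, i)     # replace phi_i by its c-conjugate
        values.append(dual_value(phis))

final_gap = C - potential_sum(phis)        # stays nonnegative: the family is feasible
print("dual values along the sweeps:", np.round(values, 4))
```

After the first sweep every potential equals its own $c$-conjugate, so the second sweep leaves the values unchanged in this toy setting.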
3. Abstract Duality, Generalizations, and Vector-Valued Extensions
Kantorovich duality is an instance of a broad family of convex-analytic duality theorems (Gover, 23 Jan 2025). In an abstract cone setting, one may encode constraints and objectives via convex cones and super/sublinear mappings (e.g., dualizing via Hahn–Banach separation), which reveal optimal transport as a special case. This approach yields:
- General vector-measure transport dualities (for vector-valued measures), in which the dual constraint is stated relative to a prescribed density, extending the classical scalar case.
- Existence and structural criteria for primal solutions, dual maximizing potentials, and transport maps in semi-discrete and infinite-dimensional problems, using the same convex-analytic principles.
This abstract viewpoint encompasses further duality settings: moment problems, zero-sum continuous games, Fenchel–Rockafellar duality, martingale optimal transport, and beyond (Gover, 23 Jan 2025).
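As a formal illustration of how the classical dual arises from this convex-analytic viewpoint, the marginal constraints can be encoded by Lagrange multipliers. The symbols are assumed notation: $\mu$, $\nu$ are the marginals, $\pi$ ranges over nonnegative measures on the product space, and $\varphi$, $\psi$ are the potentials. The exchange of infimum and supremum in the second line is exactly the step that a Hahn–Banach or Fenchel–Rockafellar argument has to justify, so this is a heuristic sketch rather than a proof.

```latex
\begin{aligned}
\inf_{\pi \in \Pi(\mu,\nu)} \int c \, d\pi
&= \inf_{\pi \ge 0} \, \sup_{\varphi,\psi}
   \Big\{ \int c \, d\pi
   + \int \varphi \, d\mu - \int \varphi(x) \, d\pi(x,y)
   + \int \psi \, d\nu - \int \psi(y) \, d\pi(x,y) \Big\} \\
&= \sup_{\varphi,\psi} \Big\{ \int \varphi \, d\mu + \int \psi \, d\nu
   + \inf_{\pi \ge 0} \int \big( c(x,y) - \varphi(x) - \psi(y) \big) \, d\pi(x,y) \Big\} \\
&= \sup_{\varphi(x) + \psi(y) \le c(x,y)}
   \Big\{ \int \varphi \, d\mu + \int \psi \, d\nu \Big\}.
\end{aligned}
```

In the first line the inner supremum is $+\infty$ unless $\pi$ has marginals $\mu$ and $\nu$. In the last line the inner infimum is $0$ when $c - \varphi - \psi \ge 0$ everywhere and $-\infty$ otherwise, which produces the dual constraint.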
4. Duality in Extended Contexts
a) Non-classical Settings: Capacities, Quantum Transport, Ergodic and Infinite-Dimensional Problems
- Capacities: Replacing probability measures by upper- or lower-continuous set functions (capacities), the primal cost becomes a Choquet integral. The dual is formulated in terms of potentials integrated with respect to capacities, with duality, attainment, and cyclic monotonicity retained under supermodularity (Gal et al., 2019).
- Quantum Optimal Transport: Quantum Kantorovich duality extends to states (density operators) on Hilbert spaces, with dual variables being self-adjoint operators acting as operator-valued “potentials.” The Fenchel–Rockafellar framework applies, and strong duality holds between the linearized quantum primal and operator-constraint dual, with explicit computations in the qubit case (Bunth et al., 30 Oct 2025).
- Ergodic Transport: For systems with invariant measures under dynamics (e.g., shifts on Bernoulli space), transport plans are required to be invariant, and duality is between cost minimization and maximizing over potentials satisfying coboundary constraints. The standard convex duality setup is adapted to the ergodic context (Lopes et al., 2012).
- Infinite Dimensions: In Banach, Hilbert, or Fréchet spaces, the role of cylindrical functions is clarified: the Kantorovich dual can often be restricted to potentials of the form $\varphi \circ P$, where $P$ is a finite-rank projection, reducing infinite-dimensional duality to finite-dimensional approximations (Zaal, 2015).
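For the capacities item above, it may help to see what a Choquet integral computes on a finite set. The sketch below is illustrative only: the function name, the toy weights, and the squared-probability capacity are assumptions, not taken from the cited work. For an additive capacity the Choquet integral reduces to the ordinary expectation.

```python
import numpy as np

def choquet_integral(values, capacity):
    """Discrete Choquet integral of a nonnegative function on {0, ..., n-1}.

    values   : sequence of f(k) >= 0
    capacity : callable mapping a frozenset of indices to a number, assumed
               monotone with capacity(empty set) = 0
    """
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)                  # indices sorted by increasing f
    total, previous = 0.0, 0.0
    for rank, k in enumerate(order):
        # Points ranked at or above k in the sorted order (an upper level set of f).
        upper_set = frozenset(int(j) for j in order[rank:])
        total += (values[k] - previous) * capacity(upper_set)
        previous = values[k]
    return total

p = np.array([0.2, 0.5, 0.3])                    # hypothetical probability weights
f = [3.0, 1.0, 2.0]                              # hypothetical nonnegative function

additive = lambda A: float(sum(p[j] for j in A))   # an ordinary probability measure
distorted = lambda A: additive(A) ** 2             # a non-additive, monotone capacity

expected_value = choquet_integral(f, additive)     # equals the usual expectation of f
choquet_value = choquet_integral(f, distorted)
print(expected_value, choquet_value)
```

The squared distortion is convex, so the resulting capacity is supermodular, which is the structural condition mentioned in the capacities item.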
b) General Measurable, Topological, and Capacity Constraints
Kantorovich duality extends to arbitrary measurable spaces, capacity functions, and completely regular Hausdorff spaces. Under broad continuity or separability hypotheses, the corresponding primal and dual problems exhibit strong duality and complementary slackness, after suitable adaptations such as intermediate problems, compactness arguments, or rectified lower semicontinuous costs (Bachir, 5 Mar 2025, Rigo, 2019, Beiglboeck et al., 2010).
Rectification of the cost function ensures that duality holds even if $c$ is not lower semicontinuous: the minimal rectification of $c$ restores duality without altering the dual problem, which shows how the regularity of $c$ enters the theory (Beiglboeck et al., 2010).
5. Structural Properties: c-Cyclical Monotonicity, Splitting Sets, and Support Geometry
A defining feature of optimal Kantorovich plans is that their support lies on $c$-cyclically monotone sets, that is, sets on which no cyclic rearrangement of the matched points lowers the total cost (Cheryala et al., 23 Jan 2026, Moameni, 2014). In the multimarginal setting, this property admits a precise measure-theoretic and geometric description:
- Splitting sets and twist conditions: The existence and structure of optimal plans are governed by $c$-splitting sets and $m$-twist (injectivity) conditions on $c$. Under suitable twist hypotheses, optimal plans concentrate on unions of finitely many graphs, a direct generalization of the Monge solution in the classical case (Moameni, 2014).
- Canonical potentials and support characterization: Optimal dual potentials can always be taken in a canonical $c$-conjugate form, and the support of any optimal plan is fully described by the points on which these potentials achieve equality with $c$.
Such structural descriptions are central in statistical learning applications (e.g., barycentric mapping, stability) and in numerical schemes (e.g., iterative c-transform algorithms) (Cheryala et al., 23 Jan 2026).
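For a small finite set of matched pairs, $c$-cyclical monotonicity can be tested by brute force: no permutation of the second coordinates may give a lower total cost than the original matching. The sketch below covers the two-marginal case only; the function name, the quadratic cost, and the sample pairs are assumptions. Enumerating permutations is practical only for a handful of points.

```python
import itertools

def is_c_cyclically_monotone(pairs, cost, tol=1e-12):
    """Brute-force check over all permutations of a small finite set of pairs."""
    xs, ys = zip(*pairs)
    base = sum(cost(x, y) for x, y in pairs)
    for sigma in itertools.permutations(range(len(pairs))):
        rearranged = sum(cost(xs[i], ys[sigma[i]]) for i in range(len(pairs)))
        if rearranged < base - tol:
            return False          # some rearrangement is strictly cheaper
    return True

quadratic = lambda x, y: (x - y) ** 2

# Hypothetical supports: one pairs points in increasing order, the other crosses them.
monotone_pairs = [(0.0, 0.5), (1.0, 2.0), (3.0, 2.5)]
crossed_pairs = [(0.0, 2.5), (1.0, 2.0), (3.0, 0.5)]

monotone_ok = is_c_cyclically_monotone(monotone_pairs, quadratic)
crossed_ok = is_c_cyclically_monotone(crossed_pairs, quadratic)
print(monotone_ok, crossed_ok)
```

With the quadratic cost on the line, the order-preserving matching passes the check while the crossed matching fails it, consistent with the monotone structure of optimal plans in one dimension.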
6. Applications and Extensions
Kantorovich duality underpins a wide array of results and methodologies.
- Machine Learning and Statistics: Multimarginal duality forms the basis for multi-way distribution alignment, barycentric map learning, stability analysis under marginal perturbations, and the design of critic potentials in generative modeling (Cheryala et al., 23 Jan 2026). Sensitivity and differentiability results follow from duality and $c$-conjugacy properties.
- Generalized Transport Metrics: Duality extends to weak and partial transport (including total variation penalties), martingale transport, and convex powerset metrics (involving Hausdorff and Wasserstein compositions), often via generalized lifting constructions and predicate modalities (Wild et al., 27 Oct 2025, Chung et al., 2019, Gozlan et al., 2014).
- Quantum Information: Quantum duality determines distances and divergences between quantum states, with triangle inequalities and explicit optimizers analyzed for specific observables (Bunth et al., 30 Oct 2025).
- Urban Planning, Branched Transport, and Mechanics: Duality principles yield equivalence between transport formulations on urban networks and branched structures, offer Beckmann–type flux representations, and solve structural optimization problems in mechanics (e.g., grillage design via Hessian-constrained duality) (Lohmann et al., 2022, Bołbotowski et al., 2024).
- Infinite-Dimensional and Abstract Spaces: In Polish, completely regular, or Hausdorff spaces, duality proofs leverage properties of Baire measures, equicontinuity, and Lagrange multiplier techniques, generalizing Prokhorov's compactness to broader contexts (Bachir, 5 Mar 2025).
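To make the notion of a critic potential concrete: when the cost is a metric, the dual reduces to a supremum over 1-Lipschitz functions $f$ of $\int f\,d\mu - \int f\,d\nu$ (the Kantorovich–Rubinstein form). On the real line an optimal critic can be written down from the two cumulative distribution functions. The sketch below uses a hypothetical grid and weights and should be read as an illustration under those assumptions.

```python
import numpy as np

# Hypothetical toy data: two probability vectors on a common sorted grid.
grid = np.array([0.0, 1.0, 2.5, 4.0])
mu = np.array([0.1, 0.4, 0.3, 0.2])
nu = np.array([0.3, 0.3, 0.1, 0.3])

steps = np.diff(grid)                    # spacings between consecutive grid points
cdf_gap = np.cumsum(mu - nu)[:-1]        # F_mu - F_nu on each interval of the grid

# Primal side: on the line, the cost |x - y| gives W1 = integral of |F_mu - F_nu|.
w1_primal = float(np.sum(np.abs(cdf_gap) * steps))

# Dual side: a 1-Lipschitz critic whose slope on each interval is -sign(F_mu - F_nu).
slopes = -np.sign(cdf_gap)
critic = np.concatenate([[0.0], np.cumsum(slopes * steps)])
w1_dual = float(critic @ (mu - nu))      # integral of f d(mu) minus integral of f d(nu)

print(w1_primal, w1_dual)
```

The two values coincide, so this critic attains the supremum in the dual. In learning applications the critic is instead parametrized and trained, and the Lipschitz constraint is enforced only approximately.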
7. Computational and Algorithmic Implications
Kantorovich duality provides both convex analytical foundations and concrete algorithms:
- The dual formulation gives tractable numerical procedures (e.g., via iterative c-transform algorithms, linear or semidefinite programming in finite cases, Fenchel–Rockafellar projection methods).
- Deterministic and stochastic algorithms for barycenters, DRO (distributionally robust optimization), and weak transport all exploit dual characterizations for computational efficiency and convergence guarantees (Zhang et al., 2022, Wild et al., 27 Oct 2025, Chung et al., 2019).
- In learning contexts, dual potentials (critics) can be regularized or parametrized efficiently; canonical c-conjugate families serve as key tools for regularization and stabilization (Cheryala et al., 23 Jan 2026).
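One widely used computational scheme in this spirit replaces the hard $c$-transform by a soft-min and alternates between the two potentials. This is entropic regularization (Sinkhorn iterations in the log domain), a technique swapped in here for illustration and not one the text above attributes to any cited work. The data, the regularization strength `eps`, the iteration count, and the function names are all assumptions.

```python
import numpy as np

def soft_c_transform(potential, weights, C, eps):
    """Entropic (soft-min) c-transform along axis 1 of C.

    Returns -eps * log sum_j weights[j] * exp((potential[j] - C[i, j]) / eps),
    using a max-shift for numerical stability.
    """
    z = (potential[None, :] - C) / eps
    shift = z.max(axis=1)
    return -eps * (shift + np.log(np.exp(z - shift[:, None]) @ weights))

def sinkhorn_potentials(mu, nu, C, eps=0.25, iters=500):
    phi = np.zeros_like(mu)
    psi = np.zeros_like(nu)
    for _ in range(iters):
        phi = soft_c_transform(psi, nu, C, eps)      # update phi given psi
        psi = soft_c_transform(phi, mu, C.T, eps)    # update psi given phi
    # Plan induced by the potentials, as a density against the product mu x nu.
    plan = mu[:, None] * nu[None, :] * np.exp((phi[:, None] + psi[None, :] - C) / eps)
    return phi, psi, plan

# Hypothetical toy data on [0, 1] with quadratic cost.
x = np.linspace(0.0, 1.0, 4)
y = np.linspace(0.0, 1.0, 3)
mu = np.array([0.1, 0.4, 0.3, 0.2])
nu = np.array([0.5, 0.2, 0.3])
C = (x[:, None] - y[None, :]) ** 2

phi, psi, plan = sinkhorn_potentials(mu, nu, C)
print("row marginal error:", np.abs(plan.sum(axis=1) - mu).max())
print("regularized transport cost:", float((plan * C).sum()))
```

Smaller values of `eps` bring the result closer to the unregularized problem but typically need more iterations. The resulting plan is feasible, so its cost is an upper bound on the unregularized optimal cost.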
Summary Table: Kantorovich Duality in Key Settings
| Setting | Primal Formulation | Dual Variables / Potentials | Duality Condition |
|---|---|---|---|
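| Classical (2 marginals) | $\inf_{\pi \in \Pi(\mu,\nu)} \int c\,d\pi$ | $\varphi, \psi$ | $\varphi(x) + \psi(y) \le c(x,y)$ |
| Multimarginal | $\inf_{\pi \in \Pi(\mu_1,\dots,\mu_N)} \int c\,d\pi$ | $\varphi_1, \dots, \varphi_N$ | $\sum_{i} \varphi_i(x_i) \le c(x_1,\dots,x_N)$ |
| Capacities (Choquet) | Cost integrated against capacities (Choquet) | Potentials (continuous) | See text |
| Quantum | Linearized cost over couplings of density operators | Potentials (self-adj. operators) | Operator constraint; see text |
| Ergodic | Cost over invariant transport plans | Potentials (coboundaries) | See text |
| Weak/Partial TV | Weak or partial transport cost, with total variation penalty | See text | See text |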
References
- "A Unified Kantorovich Duality for Multimarginal Optimal Transport" (Cheryala et al., 23 Jan 2026)
- "Multi-marginal Monge-Kantorovich transport problems: A characterization of solutions" (Moameni, 2014)
- "Kantorovich duality for optimal transport on completely regular Hausdorff spaces" (Bachir, 5 Mar 2025)
- "Duality Theorems and Vector Measures in Optimal Transportation Theory" (Gover, 23 Jan 2025)
- "Kantorovich duality for general transport costs and applications" (Gozlan et al., 2014)
- "Generalized Kantorovich-Rubinstein Duality beyond Hausdorff and Kantorovich" (Wild et al., 27 Oct 2025)
- "Strong Kantorovich duality for quantum optimal transport with generic cost and optimal couplings on quantum bits" (Bunth et al., 30 Oct 2025)
- "A note on duality theorems in mass transportation" (Rigo, 2019)
- "Dual potentials for capacity constrained optimal transport" (Korman et al., 2013)
- "Kantorovich Duality and Optimal Transport Problems on Magnetic Graphs" (Robertson, 2019)
- "Kantorovich's Mass Transport Problem for Capacities" (Gal et al., 2019)
These references give detailed accounts of the structural results and generalizations discussed above.