
Sinkhorn Iterations in Optimal Transport

Updated 18 May 2026
  • Sinkhorn iterations are defined as alternating scaling updates that compute entropy-regularized optimal transport couplings with prescribed marginals.
  • They leverage mirror descent and block coordinate methods to yield geometric convergence and maintain numerical stability.
  • The algorithm is widely applied in machine learning, inverse problems, and optimal transport, with accelerated variants enhancing performance.

Sinkhorn iterations are a class of alternating matrix (or operator) scaling algorithms that compute entropy-regularized couplings in optimal transport (OT), as well as projections onto polytopes of matrices or tensors with prescribed marginals. This iterative proportional fitting paradigm yields efficient solutions to entropy-regularized OT, is fundamental in computational and statistical optimal transport, and has deep connections to mirror descent, stochastic processes, matrix analysis, PDEs, and scalable regularized linear algebra.

1. Mathematical Formulation and Canonical Algorithm

Given two discrete probability vectors $\mu \in \Delta_n$, $\nu \in \Delta_m$, and a cost matrix $C \in \mathbb{R}^{n \times m}$, the entropy-regularized OT problem is

$$\min_{\pi \in \Pi(\mu, \nu)} \sum_{i,j} C_{ij}\,\pi_{ij} + \varepsilon \sum_{i,j} \pi_{ij} (\log \pi_{ij} - 1),$$

where $\Pi(\mu, \nu) = \{\pi \geq 0 : \pi \mathbf{1} = \mu,\, \pi^T \mathbf{1} = \nu\}$ is the polytope of couplings (Karimi et al., 2023, Karlsson et al., 2016, Genevay et al., 2017).
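The product structure of the optimal plan can be motivated by a short first-order computation. The sketch below introduces Lagrange multipliers $f \in \mathbb{R}^n$ and $g \in \mathbb{R}^m$ for the two marginal constraints; these symbols are notation assumed here for illustration, not taken from the cited papers.

```latex
\begin{aligned}
\mathcal{L}(\pi, f, g) &= \sum_{i,j} C_{ij}\,\pi_{ij}
  + \varepsilon \sum_{i,j} \pi_{ij} (\log \pi_{ij} - 1)
  - \sum_i f_i \Big( \sum_j \pi_{ij} - \mu_i \Big)
  - \sum_j g_j \Big( \sum_i \pi_{ij} - \nu_j \Big), \\
\frac{\partial \mathcal{L}}{\partial \pi_{ij}} &= C_{ij} + \varepsilon \log \pi_{ij} - f_i - g_j = 0
  \;\Longrightarrow\;
  \pi_{ij} = e^{f_i/\varepsilon}\, e^{-C_{ij}/\varepsilon}\, e^{g_j/\varepsilon}.
\end{aligned}
```

Each entry of the stationary plan therefore factors into a row term, a term depending only on the cost, and a column term. The constraint $\pi \geq 0$ plays no role here because the exponential form is strictly positive.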

The scaling fixed point characterization follows: define the Gibbs kernel $K = \exp(-C/\varepsilon)$ elementwise. The optimal plan is $\pi^* = \operatorname{diag}(u)\,K\,\operatorname{diag}(v)$ for some vectors $u > 0$, $v > 0$ such that $u$ and $v$ solve

$$u \odot (K v) = \mu, \qquad v \odot (K^T u) = \nu,$$

where $\odot$ denotes the elementwise product.

The Sinkhorn iterations then alternate these updates:

$$u^{(k+1)} = \frac{\mu}{K v^{(k)}}, \qquad v^{(k+1)} = \frac{\nu}{K^T u^{(k+1)}},$$

where division is elementwise (Karimi et al., 2023, Genevay et al., 2017).
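The alternating scaling updates can be sketched in a few lines. The following is a minimal illustration in plain Python (lists instead of an array library, to stay dependency-free); the function name `sinkhorn`, the fixed iteration count, and the small 2×3 example are assumptions made here, not part of the cited works.

```python
import math

def sinkhorn(mu, nu, C, eps, n_iter=500):
    """Alternate the two scaling updates for a fixed number of sweeps."""
    n, m = len(mu), len(nu)
    # Gibbs kernel K = exp(-C / eps), elementwise.
    K = [[math.exp(-C[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # u <- mu / (K v), elementwise division
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # v <- nu / (K^T u), elementwise division
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Plan pi = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Illustrative data (hypothetical): two source points, three target points.
mu = [0.5, 0.5]
nu = [0.25, 0.25, 0.5]
C = [[0.0, 1.0, 2.0],
     [2.0, 1.0, 0.0]]

pi = sinkhorn(mu, nu, C, eps=0.5)
row_sums = [sum(row) for row in pi]
col_sums = [sum(pi[i][j] for i in range(len(mu))) for j in range(len(nu))]
```

After the final column update the column sums match `nu` up to rounding, while the row sums match `mu` only up to the remaining convergence error; a practical implementation would stop on a marginal-error tolerance rather than a fixed sweep count.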

2. Mirror Descent and Optimization-Theoretic Perspective

Sinkhorn iterations arise as alternating Bregman projections, or equivalently as mirror descent, in the Kullback–Leibler geometry (Karimi et al., 2023, Chizat et al., 2023). The underlying objective is convex over couplings, and the classical iteration may be interpreted as block coordinate mirror descent with a fixed step size, alternating projections onto the row- and column-marginal sets. For the classical iteration, the discrete-time updates for the dual potentials $f = \varepsilon \log u$ and $g = \varepsilon \log v$ take the explicit form

$$f_i^{(k+1)} = \varepsilon \log \mu_i - \varepsilon \log \sum_j \exp\!\big((g_j^{(k)} - C_{ij})/\varepsilon\big), \qquad g_j^{(k+1)} = \varepsilon \log \nu_j - \varepsilon \log \sum_i \exp\!\big((f_i^{(k+1)} - C_{ij})/\varepsilon\big).$$

The primal-dual structure makes the iteration amenable to convex analysis and stochastic approximation, and allows for flexible discretizations and generalizations (Karimi et al., 2023).
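The projection viewpoint can be made concrete: the KL projection of a positive matrix onto the set of matrices with prescribed row sums is a row rescaling, and likewise for columns. The sketch below, with helper names and example data chosen here for illustration, applies these two projections alternately to the Gibbs kernel and checks numerically that the row rescaling beats another feasible candidate in KL divergence.

```python
import math

def kl_project_rows(pi, mu):
    """KL projection onto {pi : pi 1 = mu}: rescale each row to sum to mu[i]."""
    out = []
    for i, row in enumerate(pi):
        s = sum(row)
        out.append([mu[i] * x / s for x in row])
    return out

def kl_project_cols(pi, nu):
    """KL projection onto {pi : pi^T 1 = nu}: rescale each column to sum to nu[j]."""
    col = [sum(row[j] for row in pi) for j in range(len(nu))]
    return [[nu[j] * row[j] / col[j] for j in range(len(nu))] for row in pi]

def kl(p, q):
    """Generalized KL divergence sum(p log(p/q) - p + q) for positive matrices."""
    return sum(a * math.log(a / b) - a + b
               for rp, rq in zip(p, q) for a, b in zip(rp, rq))

# Illustrative data (hypothetical); eps = 1 so the kernel is exp(-C).
mu = [0.5, 0.5]
nu = [0.25, 0.25, 0.5]
C = [[0.0, 1.0, 2.0],
     [2.0, 1.0, 0.0]]
K = [[math.exp(-c) for c in row] for row in C]

# A single row projection, compared against another matrix with the same row sums.
P = kl_project_rows(K, mu)
uniform = [[mu[i] / len(nu)] * len(nu) for i in range(len(mu))]

# Alternating projections starting from the kernel.
pi = K
for _ in range(200):
    pi = kl_project_rows(pi, mu)
    pi = kl_project_cols(pi, nu)
```

Because each projection only rescales rows or columns, every iterate keeps the form of a diagonal scaling of `K` on both sides, which is why this loop and the scaling-vector form of the algorithm produce the same plans.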

As the step size vanishes, time-rescaled Sinkhorn iterations converge to a continuous-time flow, the “Sinkhorn flow”. An equivalent dual formulation evolves the Schrödinger potentials in time, and the coupling is reconstructed from these potentials (Karimi et al., 2023, Deb et al., 2023).

This continuous-time limit has been rigorously analyzed as a Wasserstein mirror gradient flow, interpolating between relative entropy (KL) and quadratic cost via a mirror functional. The dynamics can be recast as a parabolic Monge-Ampère PDE in a convex potential, again highlighting the link between the discrete iteration and transport PDEs (Deb et al., 2023).

4. Convergence, Complexity, and Phase Transition Phenomena

In the classical square matrix scaling (“Sinkhorn–Knopp”) context, alternating normalization converges linearly in the Hilbert projective metric, and an explicit iteration bound is available for dense matrices at a prescribed error level (He, 13 Jul 2025, Genevay et al., 2017). However, for matrices with subcritical density, the required iteration count can degrade, revealing a sharp phase transition (He, 13 Jul 2025).
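The contraction in the Hilbert projective metric can be observed numerically on a small positive matrix. The sketch below is illustrative only: the 3×3 matrix, the helper names, and the choice to track the gap between successive column scalings are assumptions made here, and the example does not attempt to reproduce the density-dependent phase transition.

```python
import math

def hilbert_metric(x, y):
    """Hilbert projective metric between two positive vectors."""
    ratios = [a / b for a, b in zip(x, y)]
    return math.log(max(ratios) / min(ratios))

def sinkhorn_knopp_step(A, c):
    """One sweep toward a doubly stochastic scaling: r = 1/(A c), then c = 1/(A^T r)."""
    n = len(A)
    r = [1.0 / sum(A[i][j] * c[j] for j in range(n)) for i in range(n)]
    c_new = [1.0 / sum(A[i][j] * r[i] for i in range(n)) for j in range(n)]
    return r, c_new

# Arbitrary strictly positive matrix (hypothetical example).
A = [[1.0, 8.0, 2.0],
     [3.0, 1.0, 9.0],
     [7.0, 2.0, 1.0]]

c = [1.0, 1.0, 1.0]
gaps = []  # Hilbert-metric distance between successive column scalings
for _ in range(6):
    r, c_next = sinkhorn_knopp_step(A, c)
    gaps.append(hilbert_metric(c, c_next))
    c = c_next

# Run further sweeps and form the scaled matrix diag(r) A diag(c).
for _ in range(200):
    r, c = sinkhorn_knopp_step(A, c)
S = [[r[i] * A[i][j] * c[j] for j in range(3)] for i in range(3)]
```

For a strictly positive matrix, Birkhoff's contraction theorem implies that the entries of `gaps` shrink by at least a fixed factor per sweep, which is the linear convergence referred to above.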

For entropy-regularized OT, non-asymptotic rates show geometric convergence in the potentials and in Wasserstein and relative entropy metrics. These exponential rates extend to non-compact and weakly convex settings under log-concavity assumptions, and adapt to log domain implementations for numerical stability (Conforti et al., 2023, Greco et al., 2023, Conforti et al., 2023).

Complexity per iteration is dominated by two matrix–vector multiplies, $O(nm)$ for dense kernels, but this can be reduced to a cost that is linear in $n + m$ using structured kernels and positive feature approximations (Scetbon et al., 2020, Liao et al., 2022). Damped and stochastic variants further enhance robustness and numerical behavior in practice (Chizat et al., 2023, Karimi et al., 2023).
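The saving from a factored kernel comes from the order of multiplication. As a hedged sketch, suppose the kernel is given as $K = \Phi \Psi^T$ with positive feature matrices of width $r$ (the names `Phi`, `Psi` and the numbers below are hypothetical, and no particular feature construction from the cited papers is reproduced). Then $Kv$ can be evaluated as $\Phi(\Psi^T v)$ without ever forming the $n \times m$ matrix.

```python
def matvec(M, x):
    """Dense matrix-vector product with lists of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def lowrank_kernel_matvec(Phi, Psi, v):
    """Compute (Phi Psi^T) v as Phi (Psi^T v), never forming the n x m kernel."""
    t = matvec(transpose(Psi), v)  # r numbers, about r * m multiplications
    return matvec(Phi, t)          # n numbers, about r * n multiplications

# Hypothetical positive features: n = 3, m = 4, r = 2.
Phi = [[1.0, 0.5],
       [0.2, 2.0],
       [1.5, 0.1]]
Psi = [[0.3, 1.0],
       [2.0, 0.4],
       [0.7, 0.7],
       [1.1, 0.2]]
v = [0.1, 0.2, 0.3, 0.4]

fast = lowrank_kernel_matvec(Phi, Psi, v)

# Reference: build the dense kernel explicitly and multiply.
K = [[sum(Phi[i][k] * Psi[j][k] for k in range(2)) for j in range(4)]
     for i in range(3)]
dense = matvec(K, v)
```

Positivity of the features matters in this setting because it keeps every entry of the implied kernel positive, so the elementwise divisions in the scaling updates remain well defined.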

5. Generalizations and Connections to Broader Mathematical Structures

Sinkhorn-type iterations generalize to a wide class of scaling and constraint imposition problems:

  • Generalized Sinkhorn: Extensions compute the proximal operator of regularized transport and appear in proximal splitting for inverse problems (Karlsson et al., 2016).
  • Operator Scaling and Sinkhorn: Non-commutative (operator) versions, e.g., “Operator Sinkhorn Iteration,” enable scaling for completely positive maps, with applications to quantum information and combinatorial optimization (Eisenmann et al., 13 Mar 2026, Franks et al., 2022).
  • Continuous, Torus, and Gaussian settings: The algorithm adapts to $\mathbb{R}^d$ with unbounded cost, the torus $\mathbb{T}^d$ (with spectral and HJB techniques), and closed-form Riccati flows in Gaussian models (Akyildiz et al., 2024, Conforti et al., 2023, Greco et al., 2023).

Table: Key Sinkhorn Variants and Domains

| Domain/Problem | Features | Source |
|--------------------------|-----------------------------------|-----------------|
| Discrete OT (classical) | Alternating scaling, matrix form | (Genevay et al., 2017, Karimi et al., 2023) |
| Operator scaling | Hilbert-metric geodesics | (Eisenmann et al., 13 Mar 2026, Franks et al., 2022) |
| Continuous/Quadratic OT | PDE, parabolic Monge-Ampère | (Conforti et al., 2023, Deb et al., 2023) |
| Gaussian models | Riccati/Kalman recursions | (Akyildiz et al., 2024) |

6. Algorithms, Stabilization, and Accelerated Schemes

The canonical (classical) implementation is numerically stable for moderate regularization but may require stabilization in the small-$\varepsilon$ regime due to floating-point underflow and overflow. Log-domain Sinkhorn and periodic centering of dual potentials are standard remedies (Karlsson et al., 2016, Genevay et al., 2017, Wu et al., 6 Feb 2025). Accelerated and hybrid algorithms blend Sinkhorn with sparse Newton steps, yielding super-exponential convergence when approaching the OT vertex (Tang et al., 2024, Wu et al., 6 Feb 2025). Further variants, such as damped Sinkhorn, inexact Sinkhorn, and projected-gradient generalizations, enhance stability and extend applicability, e.g., to vector quantile regression (Karimi et al., 2023, Kato et al., 23 Mar 2026).
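A log-domain variant can be sketched by iterating on dual potentials $f$ and $g$ (with $u = e^{f/\varepsilon}$, $v = e^{g/\varepsilon}$) and evaluating every sum through a max-shifted log-sum-exp. The code below is a minimal illustration under assumptions made here: the function names are hypothetical, the iteration count is fixed, and the example is constructed so that every entry of $C/\varepsilon$ is near 1000, which makes `exp(-C/eps)` underflow to `0.0` in double precision and would break the plain scaling updates.

```python
import math

def logsumexp(xs):
    """log(sum(exp(x))) computed with a max shift to avoid under/overflow."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def sinkhorn_log(mu, nu, C, eps, n_iter=200):
    """Sinkhorn updates on potentials f = eps*log(u), g = eps*log(v)."""
    n, m = len(mu), len(nu)
    f = [0.0] * n
    g = [0.0] * m
    for _ in range(n_iter):
        f = [eps * math.log(mu[i])
             - eps * logsumexp([(g[j] - C[i][j]) / eps for j in range(m)])
             for i in range(n)]
        g = [eps * math.log(nu[j])
             - eps * logsumexp([(f[i] - C[i][j]) / eps for i in range(n)])
             for j in range(m)]
    return f, g

def plan_from_potentials(f, g, C, eps):
    """pi_ij = exp((f_i + g_j - C_ij) / eps)."""
    return [[math.exp((f[i] + g[j] - C[i][j]) / eps) for j in range(len(g))]
            for i in range(len(f))]

# Illustrative data (hypothetical): C / eps is about 1000 in every entry.
mu = [0.5, 0.5]
nu = [0.25, 0.25, 0.5]
C = [[1.000, 1.001, 1.002],
     [1.002, 1.001, 1.000]]
eps = 1e-3

naive_kernel_entry = math.exp(-C[0][0] / eps)  # underflows to 0.0

f, g = sinkhorn_log(mu, nu, C, eps)
pi = plan_from_potentials(f, g, C, eps)
row_sums = [sum(row) for row in pi]
col_sums = [sum(pi[i][j] for i in range(2)) for j in range(3)]
```

The potentials stay of moderate size even though the kernel itself is not representable, which is the point of working in the log domain. This example converges quickly because the costs differ only slightly relative to $\varepsilon$; problems with large cost gaps at small $\varepsilon$ generally need many more sweeps.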

7. Applications and Extensions

Sinkhorn iterations underpin scalable computation in numerous domains.

A plausible implication is that the versatility and scalability of Sinkhorn-type iterations make them a central primitive for regularized transport in modern statistics, generative modeling, theoretical computer science, and applied mathematics.
