Sinkhorn Flow in Optimal Transport
- Sinkhorn Flow is the continuum limit of entropy-regularized Sinkhorn iterations, producing an absolutely continuous path in the 2-Wasserstein space.
- It is characterized by a mirror gradient flow of the KL divergence with a non-standard velocity field derived from the Brenier map, linking optimal transport with parabolic Monge–Ampère dynamics.
- The flow exhibits exponential convergence under logarithmic Sobolev conditions and is represented via a McKean–Vlasov diffusion, underpinning its applications in generative modeling and robust optimal transport.
Sinkhorn Flow is the continuum limit of the Sinkhorn algorithm (iterative proportional fitting procedure, IPFP) for entropy-regularized optimal transport, realized as the number of iterations grows while the entropic penalty vanishes with a precise scaling. In this limit, the alternating projections of the classical Sinkhorn algorithm converge to an absolutely continuous path in the 2-Wasserstein space, whose evolution is governed by a Wasserstein mirror gradient flow of the Kullback–Leibler (KL) divergence. This continuous-time dynamics can be described both via a continuity equation with a non-standard (mirror) velocity field and via an associated parabolic Monge–Ampère equation for the dual Brenier potential, with convergence rates, well-posedness, and stochastic process representations available under suitable analytic conditions.
1. Definition and Conceptualization
The Sinkhorn flow is defined as the scaling limit of the Sinkhorn (IPFP) iterations for the entropically regularized optimal transport problem. Let the regularization parameter ε tend to zero, with the number of iterations proportional to 1/ε. Consider two probability densities, e^{-f} (source) and e^{-g} (target), on ℝ^d. When solving the entropy-regularized OT problem with the Sinkhorn algorithm, the sequence of marginal iterates in one coordinate converges, in the large-iteration and small-ε regime, to a continuous path (ρ_t)_{t≥0} in the 2-Wasserstein space.
This path solves the continuity equation ∂_t ρ_t + ∇·(ρ_t v_t) = 0,
where the velocity field v_t is not the usual Wasserstein gradient of KL(ρ_t‖e^{-f}) but a mirror gradient defined relative to the Brenier map transporting ρ_t to e^{-g}. More precisely, the flow is the Wasserstein mirror gradient flow of the KL divergence, with mirror map U(ρ) = ½ W_2²(ρ, e^{-g}) and objective F(ρ) = KL(ρ‖e^{-f}).
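To make the discrete-to-continuum picture concrete, the following sketch runs the plain Sinkhorn (IPFP) matrix scaling on a small one-dimensional grid, taking the number of sweeps proportional to 1/ε as in the scaling above. The grid, the Gaussian-type marginals, and the helper name `sinkhorn_marginal_path` are illustrative assumptions, not the construction analyzed in the source.

```python
import numpy as np

# Illustrative setup (assumed): 1D grid, Gaussian-like source/target densities.
n = 200
x = np.linspace(-4.0, 4.0, n)
source = np.exp(-0.5 * (x + 1.0) ** 2); source /= source.sum()   # plays the role of e^{-f}
target = np.exp(-0.5 * (x - 1.0) ** 2); target /= target.sum()   # plays the role of e^{-g}

C = 0.5 * (x[:, None] - x[None, :]) ** 2      # quadratic cost

def sinkhorn_marginal_path(eps, n_iters):
    """Run IPFP and record the first-coordinate marginal after each full sweep."""
    K = np.exp(-C / eps)
    a = np.ones(n)
    b = np.ones(n)
    path = []
    for _ in range(n_iters):
        a = source / (K @ b)                  # match the source marginal
        b = target / (K.T @ a)                # match the target marginal
        # first-coordinate marginal of the current coupling diag(a) K diag(b)
        path.append(a * (K @ b))
    return path

# Number of sweeps scales like 1/eps, mimicking the continuum-limit regime.
for eps in (0.5, 0.1, 0.02):
    path = sinkhorn_marginal_path(eps, n_iters=int(1.0 / eps))
    print(f"eps={eps:5.2f}: final L1 gap to the source marginal = "
          f"{np.abs(path[-1] - source).sum():.2e}")
```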
2. Mathematical Formulation
There are equivalent characterizations of the limiting dynamics:
- Parabolic Monge–Ampère (PMA) Equation. The dual potential u_t evolves according to a parabolic Monge–Ampère equation in which ∂_t u_t is expressed through the Monge–Ampère operator det D²u_t and the densities e^{-f} and e^{-g}; here ∇u_t is the Brenier map linking ρ_t and e^{-g}.
- Continuity Equation with Mirror Velocity. The marginal ρ_t is recovered as a pushforward through the Brenier map generated by u_t and solves the continuity equation ∂_t ρ_t + ∇·(ρ_t v_t) = 0 with the mirror velocity field v_t.
- Mirror Gradient Flow Representation. The evolution can be written schematically as ∂_t ∇u_t = −∇_{W_2} KL(ρ_t‖e^{-f}), i.e. the time derivative of the Brenier map is driven by the Wasserstein gradient of KL(ρ_t‖e^{-f}).
The metric derivative of (ρ_t)_{t≥0} with respect to the linearized optimal transport (LOT) distance is exactly the L²(ρ_t)-norm of v_t.
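Since every characterization above is phrased through the Brenier map, it helps to recall that in one dimension (for the quadratic cost) this map is the monotone rearrangement F_target⁻¹ ∘ F_ρ. The sketch below computes it on a grid; the densities and discretization are assumptions made purely for illustration.

```python
import numpy as np

# Illustrative 1D setting (assumed): for the quadratic cost, the Brenier map is
# the monotone rearrangement T = F_target^{-1} o F_rho.
x = np.linspace(-5.0, 5.0, 1001)
dx = x[1] - x[0]
rho = np.exp(-0.5 * x**2)
rho /= rho.sum() * dx                      # current marginal rho_t
tgt = np.exp(-0.5 * (x - 2.0)**2)
tgt /= tgt.sum() * dx                      # target density e^{-g}

def cdf(density):
    """Cumulative distribution function of a grid density."""
    c = np.cumsum(density) * dx
    return c / c[-1]

F_rho, F_tgt = cdf(rho), cdf(tgt)

# Brenier map: T(x_i) is the target quantile at level F_rho(x_i).
T = np.interp(F_rho, F_tgt, x)

# Sanity check: for equal-variance Gaussians the map should be close to x -> x + 2.
bulk = slice(100, 900)
print("max |T(x) - (x + 2)| on the bulk:", np.abs(T[bulk] - (x[bulk] + 2.0)).max())
```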
3. Convergence and Regularity Conditions
Rigorous convergence of discrete Sinkhorn iterates to Sinkhorn flow requires several regularity and convexity conditions:
- The initial potential u_0 and the functions f and g defining the marginal densities e^{-f} and e^{-g} are smooth with uniformly bounded derivatives up to order three or four.
- The Hessian matrices D²u_t are uniformly positive definite and bounded on compact time intervals.
- The densities e^{-f} and e^{-g} satisfy a logarithmic Sobolev inequality (LSI), which is also used to establish exponential convergence to equilibrium.
Under these assumptions, the Wasserstein distance between the marginals produced by the Sinkhorn algorithm and the continuum Sinkhorn flow vanishes as ε→0, with a quantitative rate on compact time intervals.
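A crude numerical probe of this scaling limit (not an experiment from the source): with the iteration count tied to t/ε, the Sinkhorn marginal at a fixed "time" t should stabilize as ε shrinks. The grid, marginals, and the helper `marginal_at_time` below are assumptions for illustration.

```python
import numpy as np

# Numerical probe (illustrative): compare Sinkhorn marginals at fixed "time" t
# computed with two different regularization levels, using ~ t/eps sweeps each.
n = 150
x = np.linspace(-4.0, 4.0, n)
source = np.exp(-0.5 * (x + 1.0) ** 2); source /= source.sum()
target = np.exp(-0.5 * (x - 1.0) ** 2); target /= target.sum()
C = 0.5 * (x[:, None] - x[None, :]) ** 2

def marginal_at_time(eps, t):
    """First-coordinate marginal after round(t / eps) Sinkhorn sweeps."""
    K = np.exp(-C / eps)
    a, b = np.ones(n), np.ones(n)
    for _ in range(max(1, int(round(t / eps)))):
        a = source / (K @ b)
        b = target / (K.T @ a)
    return a * (K @ b)

t = 0.5
m_coarse = marginal_at_time(eps=0.10, t=t)
m_fine   = marginal_at_time(eps=0.02, t=t)
# Total-variation distance between the two discretized marginals.
print("TV distance between eps=0.10 and eps=0.02 marginals at t=0.5:",
      0.5 * np.abs(m_coarse - m_fine).sum())
```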
4. Connection to Optimal Transport and Mirror Flows
Classical optimal transport uses the 2-Wasserstein metric, and its gradient flow of the KL divergence is the standard Fokker–Planck equation. The Sinkhorn flow, by contrast, is a mirror gradient flow: the "mirror" is U(ρ) = ½ W_2²(ρ, e^{-g}) and the objective is F(ρ) = KL(ρ‖e^{-f}). The gradient is taken in the mirror coordinates defined by the Brenier map. The velocity field is not the Wasserstein gradient of the KL divergence but its pullback via the mirror transformation, and the metric speed of the curve, measured in a linearized optimal transport distance, equals the L²(ρ_t)-norm of v_t.
This structure allows the theory to connect the mirror descent interpretation of Sinkhorn (in the discrete setting) with the continuous-time flow, and, via the PMA equation, with the nonlinear geometry of optimal transport.
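For intuition on the mirror-descent reading, the finite-dimensional analogue is the entropic mirror map on the probability simplex, under which additive gradient steps become multiplicative updates, the same mechanism that Sinkhorn-type scalings exploit in the discrete setting. The sketch below is this generic simplex analogue, not the Wasserstein mirror flow itself.

```python
import numpy as np

# Generic entropic mirror descent on the probability simplex (illustrative analogue):
# mirror map = negative entropy, so the update is multiplicative (exponentiated gradient).
rng = np.random.default_rng(0)
q = rng.random(10); q /= q.sum()           # reference distribution
p = np.ones(10) / 10                       # initial point on the simplex

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

eta = 0.5                                  # step size
for step in range(50):
    grad = np.log(p / q) + 1.0             # gradient of KL(p || q)
    p = p * np.exp(-eta * grad)            # multiplicative (mirror) step
    p /= p.sum()                           # projection back to the simplex

print("KL(p || q) after 50 mirror-descent steps:", kl(p, q))
```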
5. Parabolic Monge–Ampère Equation and Potential Representation
The Sinkhorn potential (the logarithmic scaling factor in the Sinkhorn algorithm) converges, in the same scaling limit, to the solution u_t of the PMA equation introduced above. This gives a "potential" description of the flow: the entire dynamics is encoded in the dual Brenier potential, which generates the family of marginal distributions via the Brenier map.
The PMA equation also reinforces the interpretation of the Sinkhorn flow as a mirror gradient flow: the evolution of u_t is the mirror descent of the KL divergence in the geometry generated by the squared Wasserstein distance to e^{-g}.
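As a purely illustrative companion, the sketch below integrates the canonical one-dimensional parabolic Monge–Ampère flow ∂_t u = log u'' with an explicit finite-difference scheme. This model equation only conveys the type of the dynamics; the Sinkhorn PMA equation in the source additionally involves f, g and the Brenier map and is not reproduced here.

```python
import numpy as np

# Illustrative only: canonical 1D parabolic Monge-Ampere flow  du/dt = log u''.
# This is a generic model of PMA-type dynamics, not the Sinkhorn PMA equation.
n = 201
x = np.linspace(-3.0, 3.0, n)
h = x[1] - x[0]
u = 0.5 * x**2 + 0.1 * np.cos(x)          # convex initial potential (u'' > 0)

dt = 2e-4                                  # small step for explicit-scheme stability
for _ in range(2000):
    upp = (u[2:] - 2 * u[1:-1] + u[:-2]) / h**2     # discrete second derivative
    u[1:-1] += dt * np.log(np.maximum(upp, 1e-12))  # explicit Euler step (interior)
    # endpoints held fixed (Dirichlet boundary), a crude but simple choice

print("min discrete u'' after the flow:",
      ((u[2:] - 2 * u[1:-1] + u[:-2]) / h**2).min())
```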
6. Exponential Convergence and Logarithmic Sobolev Inequalities
If the marginal densities satisfy a logarithmic Sobolev inequality with constant λ > 0, as in the assumptions above, then the KL divergence KL(ρ_t‖e^{-f}) decays exponentially along the Sinkhorn flow, at a rate governed by λ together with a time-dependent uniform lower bound on the metric induced by the Hessian of u_t. Exponential convergence in the Wasserstein distance can also be deduced via a suitable HWI inequality. This result parallels the exponential decay of entropy in classical Wasserstein gradient flows, but in the non-canonical geometry specific to the Sinkhorn flow.
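Schematically, writing λ for the LSI constant and α for the Hessian lower bound (symbols chosen here for exposition), the decay statement has the familiar entropy-dissipation shape below; the precise rate constant is the one established in the source and is not reproduced exactly.

```latex
% Schematic decay along the Sinkhorn flow; c(\lambda,\alpha) stands for a rate
% determined by the LSI constant \lambda and the Hessian lower bound \alpha.
\mathrm{KL}\!\left(\rho_t \,\middle\|\, e^{-f}\right)
  \;\le\; e^{-c(\lambda,\alpha)\, t}\,
  \mathrm{KL}\!\left(\rho_0 \,\middle\|\, e^{-f}\right),
  \qquad t \ge 0 .
```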
7. McKean–Vlasov Diffusion Representation
There is a stochastic differential equation (SDE), driven by a Brownian motion B_t, whose time-marginals coincide with ρ_t. Its drift and diffusion coefficients depend on the current solution u_t of the PMA equation, making the process a McKean–Vlasov (mean-field) type diffusion. This connection is directly analogous to the relationship between the Fokker–Planck PDE and Langevin diffusion in the classical Wasserstein gradient-flow setting.
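To make the mean-field reading concrete, the sketch below runs an Euler–Maruyama particle approximation of a generic McKean–Vlasov diffusion whose drift depends on the empirical law of the particles. The confining potential, interaction term, and parameters are made up for illustration; this is not the specific SDE attached to the Sinkhorn flow.

```python
import numpy as np

# Generic McKean-Vlasov particle system (illustrative; not the Sinkhorn-flow SDE):
# dX_t = -grad V(X_t) dt - kappa * (X_t - E[X_t]) dt + sqrt(2) dB_t,
# with E[X_t] replaced by the empirical mean over interacting particles.
rng = np.random.default_rng(1)
n_particles, n_steps, dt = 2000, 1000, 1e-3
kappa = 1.0                                   # mean-field attraction strength

X = rng.normal(3.0, 1.0, size=n_particles)    # initial particle positions

def grad_V(x):
    return x                                  # confining potential V(x) = x^2 / 2

for _ in range(n_steps):
    mean_field = X.mean()                     # empirical approximation of E[X_t]
    drift = -grad_V(X) - kappa * (X - mean_field)
    X = X + drift * dt + np.sqrt(2 * dt) * rng.normal(size=n_particles)

print("empirical mean and variance at the final time:", X.mean(), X.var())
```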
Summary
The Sinkhorn flow emerges as the continuum limit of entropy-regularized Sinkhorn iterations in optimal transport, realized as a Wasserstein mirror gradient flow of KL divergence, with a non-canonical velocity field determined by the mirror geometry of Brenier maps. Its dynamics equivalently solve both a continuity equation with a mirror velocity and a parabolic Monge–Ampère equation for the dual potential. The flow exhibits exponential convergence under a logarithmic Sobolev inequality and can be probabilistically represented by a McKean–Vlasov diffusion. This theoretical framework clarifies the convergence of the Sinkhorn algorithm in strong topologies and establishes its PDE and stochastic process connections for a range of applications in optimal transport, generative modeling, and beyond (Deb et al., 2023).