On the Kantorovich contraction of Markov semigroups (2511.08111v1)

Published 11 Nov 2025 in math.PR and math.ST

Abstract: This paper develops a novel operator theoretic framework to study the contraction properties of Markov semigroups with respect to a general class of Kantorovich semi-distances, which notably includes Wasserstein distances. The rather simple contraction cost framework developed in this article, which combines standard Lyapunov techniques with local contraction conditions, helps to unify and simplify many arguments in the stability of Markov semigroups, as well as to improve upon some existing results. Our results can be applied to both discrete time and continuous time Markov semigroups, and we illustrate their wide applicability in the context of (i) Markov transitions on models with boundary states, including bounded domains with entrance boundaries, (ii) operator products of a Markov kernel and its adjoint, including two-block-type Gibbs samplers, (iii) iterated random functions and (iv) diffusion models, including overdamped Langevin diffusion with convex at infinity potentials.

Summary

  • The paper presents an operator-theoretic framework that proves exponential contraction in Kantorovich semi-distances for diverse Markov semigroups using a geometric drift condition and local contraction properties.
  • It rigorously connects weighted total variation norms with Wasserstein metrics via the Kantorovich-Rubinstein theorem, ensuring convergence of stability coefficients.
  • The work has broad implications for MCMC, Gibbs samplers, iterated random functions, and diffusions, enabling practical verification of exponential convergence in complex stochastic models.

Operator-Theoretic Analysis of Kantorovich Contraction in Markov Semigroups

Introduction and Background

This paper presents a comprehensive operator-theoretic framework for analyzing contraction properties of Markov semigroups relative to Kantorovich semi-distances, with particular emphasis on Wasserstein-type metrics. The approach is formulated to encompass both discrete- and continuous-time Markov semigroups, addressing important classes of models: Markov kernels on domains with boundaries, products of Markov kernels and their adjoints (including block Gibbs samplers), iterated random functions, and diffusions such as overdamped Langevin dynamics with convex-at-infinity potentials.

The central objects are the semigroup of Markov operators $P_n$, acting on probability measures over a complete, separable metric space $(S,\psi_S)$, together with weighted total variation norms ($V$-norms) induced by a lower semi-continuous Lyapunov function $V$. The primary focus is on the rate at which the Kantorovich semi-distance between the evolutes $\mu_1 P_n$ and $\mu_2 P_n$ decreases, and in particular on conditions under which this contraction is exponential.

Kantorovich Semi-Distance and $V$-Norms

Given a semi-distance $\phi$ on $S$, the Kantorovich semi-distance $D_\phi$ is defined via optimal couplings as:

$$D_\phi(\mu_1,\mu_2) = \inf_{\pi \in \Pi(\mu_1, \mu_2)} \int_{S^2} \phi(x, y) \, d\pi(x, y),$$

where $\Pi(\mu_1, \mu_2)$ denotes the set of joint probability measures on $S^2$ with $\mu_1$ and $\mu_2$ as marginals. Special cases include Wasserstein-$p$ distances ($\psi_S^p$), total variation ($\varphi_0$), and $V$-norms ($\varphi_V$).

The authors confirm, using the Kantorovich-Rubinstein theorem, that the weighted total variation norm $\|\cdot\|_V$ is equivalent to the Kantorovich semi-distance associated with a weighted discrete metric. Dual formulations are used for technical comparisons and for developing estimates.
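The equivalence above can be illustrated on a finite state space. The sketch below is not taken from the paper: it assumes the weighted discrete metric has the form $\varphi_V(x,y) = 1_{x \neq y}\,(V(x) + V(y))$ and that the $V$-norm of a signed measure is $\sum_x V(x)\,|\mu(x)|$; the paper's normalization may differ. All function names are hypothetical. The code builds one explicit coupling (shared mass kept on the diagonal, leftover mass paired off the diagonal) and checks that its transport cost equals the weighted total variation norm of $\mu_1 - \mu_2$.

```python
def weighted_tv_norm(mu1, mu2, V):
    """Sum over x of V(x) * |mu1(x) - mu2(x)| on a finite state space."""
    return sum(v * abs(a - b) for v, a, b in zip(V, mu1, mu2))


def residual_coupling(mu1, mu2):
    """Coupling that keeps min(mu1, mu2) on the diagonal and pairs the
    leftover positive and negative parts independently off the diagonal."""
    n = len(mu1)
    common = [min(a, b) for a, b in zip(mu1, mu2)]
    plus = [a - c for a, c in zip(mu1, common)]    # (mu1 - mu2)^+
    minus = [b - c for b, c in zip(mu2, common)]   # (mu1 - mu2)^-
    mass = sum(plus)
    pi = [[0.0] * n for _ in range(n)]
    for x in range(n):
        pi[x][x] = common[x]
    if mass > 0.0:
        for x in range(n):
            for y in range(n):
                pi[x][y] += plus[x] * minus[y] / mass
    return pi


def coupling_cost(pi, V):
    """Cost of pi under the assumed metric 1_{x != y} * (V(x) + V(y))."""
    n = len(V)
    return sum(pi[x][y] * (V[x] + V[y])
               for x in range(n) for y in range(n) if x != y)


mu1 = [0.5, 0.3, 0.2]
mu2 = [0.2, 0.3, 0.5]
V = [1.0, 2.0, 4.0]   # illustrative Lyapunov weights (assumption)
pi = residual_coupling(mu1, mu2)
print(coupling_cost(pi, V), weighted_tv_norm(mu1, mu2, V))
```

The coupling only shows that the weighted norm is attained as a transport cost; that no coupling does better is what the duality argument supplies.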

Uniform Contraction: Dobrushin Coefficients

The main analytic tool is the Dobrushin contraction coefficient, defined in a general setting for a Markov operator $P$ and two semi-distances $(\phi, \psi)$ as:

$$\beta_{\psi,\phi}(P) = \sup_{(\mu_1, \mu_2)} \frac{D_\phi(\mu_1 P, \mu_2 P)}{D_\psi(\mu_1, \mu_2)},$$

with the supremum taken over pairs with $D_\psi(\mu_1, \mu_2) > 0$. The authors prove that the supremum over all pairs of probability measures reduces to a supremum over pairs of Dirac measures.

This definition allows the evolution under $P$ to be compared in two different metrics, and it forms the cornerstone for deriving contraction rates. Using comparison principles and scaling properties (e.g., Lemma "klem"), the authors translate contraction in weighted norms into contraction in Wasserstein and other metrics.
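As a small illustration of the reduction to Dirac measures, consider the special case $\phi = \psi = \varphi_0$ (total variation) on a finite state space, where the coefficient becomes the largest total variation distance between two rows of the transition matrix. The following sketch uses a hypothetical three-state matrix and hypothetical function names; it is a toy check of the classical total variation case, not the paper's general construction.

```python
def total_variation(mu1, mu2):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu1, mu2))


def push_forward(mu, P):
    """Row vector times matrix: the measure mu P."""
    return [sum(mu[x] * P[x][z] for x in range(len(mu)))
            for z in range(len(P[0]))]


def dobrushin_tv(P):
    """Supremum over Dirac pairs: max TV distance between rows of P."""
    rows = range(len(P))
    return max(total_variation(P[x], P[y]) for x in rows for y in rows)


# Hypothetical transition matrix, chosen only for illustration.
P = [[0.5, 0.4, 0.1],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]

beta = dobrushin_tv(P)

# For an arbitrary pair of measures the contraction ratio stays below beta.
mu1 = [0.7, 0.2, 0.1]
mu2 = [0.1, 0.3, 0.6]
ratio = (total_variation(push_forward(mu1, P), push_forward(mu2, P))
         / total_variation(mu1, mu2))
print(beta, ratio)
```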

Main Theoretical Contributions

The central results establish conditions under which the Dobrushin contraction coefficients decay exponentially for iterates of $P$. The operator-theoretic framework requires only:

  • A geometric drift condition (the standard Lyapunov condition): $P(V) \leq \epsilon V + c$ for some $\epsilon < 1$ and $c \geq 0$.
  • A local contraction property, typically formulated for a semi-distance $\kappa$: on sets where $\varpi_V(x, y) \leq r$, $D_\kappa(\delta_x P, \delta_y P) \leq (1-\alpha(r))\, \kappa(x, y)$.

The authors prove (Theorem 1) that under these assumptions, the contraction coefficients for appropriately constructed $V$-weighted semi-distances decay exponentially. Notably, the framework avoids reliance on explicit coupling constructions or specialized metrics, instead leveraging general operator-theoretic arguments.
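To make the geometric drift condition concrete, here is a minimal sketch for a toy model that is not taken from the paper: a random walk on the nonnegative integers that steps up with probability $p < 1/2$, steps down with probability $q = 1 - p$, and stays at $0$ instead of stepping below it. With the assumed Lyapunov function $V(x) = e^{\theta x}$ and $\theta = \tfrac{1}{2}\log(q/p)$, a direct computation gives $P(V)(x) = \epsilon V(x)$ for $x \geq 1$ with $\epsilon = 2\sqrt{pq} < 1$, and the constant $c$ only has to absorb the boundary state. The function names are hypothetical.

```python
import math


def drift_constants(p):
    """Return (theta, eps, c) such that P(V) <= eps * V + c for
    V(x) = exp(theta * x) on the reflected random walk (requires p < 1/2)."""
    q = 1.0 - p
    theta = 0.5 * math.log(q / p)
    eps = p * math.exp(theta) + q * math.exp(-theta)   # equals 2*sqrt(p*q)
    pv_at_zero = p * math.exp(theta) + q               # from 0: up w.p. p, stay w.p. q
    c = pv_at_zero - eps                               # uses V(0) = 1
    return theta, eps, c


def apply_kernel_to_V(x, p, theta):
    """Exact value of P(V)(x) for the reflected walk."""
    q = 1.0 - p
    up = math.exp(theta * (x + 1))
    down = math.exp(theta * max(x - 1, 0))             # the walk stays at 0
    return p * up + q * down


def drift_holds(x, p, tol=1e-9):
    """Check P(V)(x) <= eps * V(x) + c up to a relative tolerance."""
    theta, eps, c = drift_constants(p)
    bound = eps * math.exp(theta * x) + c
    return apply_kernel_to_V(x, p, theta) <= bound * (1.0 + tol)


theta, eps, c = drift_constants(0.3)
print(eps, c, all(drift_holds(x, 0.3) for x in range(50)))
```

In this toy case the drift inequality is an equality at every state, so the pair $(\epsilon, c)$ cannot be improved for this particular choice of $V$.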

Key Numerical Results and Claims

  • Exponential Convergence: For a wide class of models, the contraction coefficients $\beta_\phi(P_n)$ and $\beta_{\varphi_V, \phi}(P_n)$ converge to zero exponentially, with explicit bounds provided.
  • Invariance and Uniqueness: The strict contraction guarantees the existence and uniqueness of an invariant probability measure of the Markov operator in the appropriate $V$-weighted space.
  • Comparison and Generalization: The results allow direct generalization to Wasserstein-$p$ distances and to domains with boundary states, as well as non-smooth and multi-block constructions (e.g., Gibbs samplers), subsuming previous results in the literature.
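The exponential decay of the coefficients can be observed numerically in the simplest setting. The sketch below is restricted to the total variation coefficient on a finite state space, where the classical submultiplicativity $\beta(P^n) \leq \beta(P)^n$ holds; it does not reproduce the paper's explicit bounds, and the matrix and function names are hypothetical.

```python
def total_variation(mu1, mu2):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu1, mu2))


def dobrushin_tv(P):
    """Total variation Dobrushin coefficient: max TV distance between rows."""
    rows = range(len(P))
    return max(total_variation(P[x], P[y]) for x in rows for y in rows)


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def contraction_profile(P, n_max):
    """Return [beta(P^1), ..., beta(P^n_max)]."""
    profile, power = [], P
    for _ in range(n_max):
        profile.append(dobrushin_tv(power))
        power = mat_mul(power, P)
    return profile


# Hypothetical transition matrix, chosen only for illustration.
P = [[0.5, 0.4, 0.1],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]

betas = contraction_profile(P, 8)
for n, b in enumerate(betas, start=1):
    print(n, b, betas[0] ** n)   # coefficient of P^n next to the bound beta(P)^n
```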

Applications to Specific Model Classes

Markov Chains in Bounded and Unbounded Domains

For Markov kernels with strictly positive, continuous, and possibly unbounded densities (e.g., on $]0,1[$ or $[0, \infty[$), the Lyapunov construction ensures contraction estimates in Wasserstein metrics based on domain geometry (e.g., proximity to the boundary via $d(x, \partial S)$).

Gibbs Samplers and Operator Products

The analysis extends naturally to two-block Gibbs samplers and other samplers involving adjoints of Markov kernels. The authors provide novel Lyapunov design criteria for these models, demonstrating that geometric drift and local contraction apply to the operator product, yielding exponential stability results.
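The operator-product structure of a two-block Gibbs sampler can be written out explicitly on a finite state space. In the sketch below, which uses a hypothetical 2-by-2 joint distribution and hypothetical names, $K(x, \cdot)$ is the conditional law of $y$ given $x$, $K^*(y, \cdot)$ is the conditional law of $x$ given $y$, and the $x$-chain has transition matrix $P = K K^*$. The code checks two standard facts: the $x$-marginal is invariant for $P$, and $P$ is reversible with respect to it. It does not implement the paper's Lyapunov design criteria.

```python
def gibbs_operator_product(joint):
    """Build K, K* and P = K K* from a joint probability table joint[x][y]."""
    nx, ny = len(joint), len(joint[0])
    marg_x = [sum(joint[x]) for x in range(nx)]
    marg_y = [sum(joint[x][y] for x in range(nx)) for y in range(ny)]
    K = [[joint[x][y] / marg_x[x] for y in range(ny)] for x in range(nx)]
    K_star = [[joint[x][y] / marg_y[y] for x in range(nx)] for y in range(ny)]
    P = [[sum(K[x][y] * K_star[y][z] for y in range(ny)) for z in range(nx)]
         for x in range(nx)]
    return marg_x, K, K_star, P


# Hypothetical joint distribution, chosen only for illustration.
joint = [[0.1, 0.2],
         [0.3, 0.4]]

marg_x, K, K_star, P = gibbs_operator_product(joint)

# Invariance: marg_x P should equal marg_x.
stationary_check = [sum(marg_x[x] * P[x][z] for x in range(2)) for z in range(2)]
print(P, stationary_check)
```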

Iterated Random Functions and Diffusions

For iterated random function systems (IRFS) and continuous-time diffusions, contraction rates are derived under dissipativity conditions on the drift and appropriate regularity of the noise or transition densities. For the overdamped Langevin model with convexity outside a ball, the authors show exponential contraction in Wasserstein-$p$ metrics for polynomial and exponential Lyapunov functions.
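For intuition, the simplest globally contractive case can be simulated directly. The sketch below is an assumption-laden toy, not the paper's argument: it takes the quadratic potential $U(x) = x^2/2$ (convex everywhere, not merely at infinity), discretizes the overdamped Langevin dynamics with an Euler step of size $h$, and views the scheme as an iterated random function $x \mapsto (1-h)x + \sqrt{2h}\,\xi$. Driving two copies with the same noise (a synchronous coupling) shrinks their gap by exactly $(1-h)$ per step, which bounds every Wasserstein-$p$ distance between the two laws from above. Function names are hypothetical.

```python
import math
import random


def euler_langevin_step(x, h, xi):
    """One Euler step for dX = -U'(X) dt + sqrt(2) dW with U(x) = x^2 / 2."""
    return x - h * x + math.sqrt(2.0 * h) * xi


def synchronous_gaps(x0, y0, h, n_steps, seed=0):
    """Run two copies driven by the same Gaussian noise and record |x - y|."""
    rng = random.Random(seed)
    x, y = x0, y0
    gaps = [abs(x - y)]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)          # shared noise: synchronous coupling
        x = euler_langevin_step(x, h, xi)
        y = euler_langevin_step(y, h, xi)
        gaps.append(abs(x - y))
    return gaps


h = 0.1
gaps = synchronous_gaps(2.0, -1.0, h, n_steps=20)
for n in (0, 5, 10, 20):
    print(n, gaps[n], (1.0 - h) ** n * 3.0)   # observed gap next to (1-h)^n * |x0 - y0|
```

When the potential is only convex outside a ball, this synchronous coupling no longer contracts everywhere, which is where the combination of a Lyapunov drift with a local contraction condition becomes necessary.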

Theoretical and Practical Implications

Theoretical Consequences:

  • The paper presents a unified approach to stability, avoiding the need for intricate coupling or semimetric construction, and delivering short, direct proofs.
  • The results extend operator-theoretic tools for positive semigroups to non-total variation metrics, resolving stability questions in optimal transport and Markov process literature.

Practical Implications:

  • Provides conditions for exponential convergence in Wasserstein distance, facilitating analysis of Markov Chain Monte Carlo (MCMC) samplers, stochastic differential equations, and Gibbs samplers, including multi-block and large-scale Bayesian models.
  • The operator-theoretic perspective simplifies implementation by reducing the stability analysis to verifiable Lyapunov and contraction conditions, with little need for tailoring metric constructions to particular domains or models.

Future Directions:

Potential extensions include:

  • Optimizing Lyapunov functions for improved contraction constants in high-dimensional applications.
  • Extending results to non-homogeneous or time-dependent semigroups, including those with degenerate noise or boundary behavior.
  • Incorporation of algorithmic techniques (e.g., Sinkhorn scaling, proximal samplers) within this framework for scalable computation in statistical inference and machine learning.

Conclusion

The paper provides an elegant and robust operator-theoretic framework for establishing exponential contraction in Kantorovich semi-distances, including Wasserstein metrics, for Markov semigroups. The abstraction and generality of the approach allow it to encompass a wide variety of stochastic processes, including models with boundary states, operator products, and diffusions. The results not only unify previous stability analyses but also extend them substantially, offering direct guidance for practical implementation and analysis in computational stochastic processes and related areas in applied mathematics and statistical learning.
