Transport-Information Inequalities

Updated 4 June 2026

Transport-information inequalities are bounds that relate the cost of transporting probability measures to divergences like entropy and Fisher information.
They unify functional inequalities, concentration of measure, and PDE regularity through methods such as semigroup interpolation, variational techniques, and curvature-based approaches.
Applications span Euclidean diffusions, discrete Markov chains, quantum systems, and infinite-dimensional processes, offering insights into convergence, stability, and robustness.

Transport-information inequalities (TI), also termed transportation cost-information inequalities, are a central topic at the interface of probability, analysis, and geometry, quantifying how the cost of transporting mass between probability distributions is controlled by information-theoretic divergences such as entropy or Fisher information. These inequalities generalize and connect classical functional inequalities (logarithmic Sobolev, Poincaré), concentration of measure, PDE regularity, and stochastic process theory. Applications span Euclidean spaces, path spaces (diffusions and interacting particles), discrete Markov chains, trees, and even quantum systems.

1. General Formulations and Key Definitions

A transport-information inequality typically bounds an optimal transport cost $T_c(\nu,\mu)$ (for example, a $p$-Wasserstein metric) between a probability measure $\nu$ and a reference measure $\mu$ in terms of an "information" divergence such as the entropy $H(\nu\mid\mu)$ or the Fisher information $I(\nu\mid\mu)$:

Transport-Entropy (TE) Inequality:

$\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$

for a convex increasing function $\alpha$, a transport cost $c$, and the relative entropy $H(\nu\mid\mu) = \int \log \frac{d\nu}{d\mu}\, d\nu$, set to $+\infty$ when $\nu$ is not absolutely continuous with respect to $\mu$ (Gozlan et al., 2010).
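A concrete instance of the TE inequality is Talagrand's inequality for the standard Gaussian $\gamma$ on $\mathbb{R}$: $W_2^2(\nu,\gamma) \leq 2H(\nu\mid\gamma)$, i.e. $\alpha(t) = t/2$ with quadratic cost. The sketch below checks it on the family $\nu = N(m, s^2)$, where both sides have closed forms. The function names are illustrative and not taken from any cited work.

```python
import math


def relative_entropy(m: float, s: float) -> float:
    """H(nu | gamma) for nu = N(m, s^2) and gamma = N(0, 1), in closed form."""
    return 0.5 * (s * s + m * m - 1.0 - math.log(s * s))


def w2_squared(m: float, s: float) -> float:
    """W_2^2(nu, gamma) for the same pair of one-dimensional Gaussians."""
    return m * m + (s - 1.0) ** 2


def talagrand_gap(m: float, s: float) -> float:
    """2 H - W_2^2, which Talagrand's inequality says is nonnegative."""
    return 2.0 * relative_entropy(m, s) - w2_squared(m, s)


if __name__ == "__main__":
    for m, s in [(0.0, 0.5), (1.0, 1.0), (2.0, 3.0)]:
        print(f"m={m}, s={s}, gap={talagrand_gap(m, s):.6f}")
```

On this family the gap simplifies to $2(s - 1 - \log s)$, so it does not depend on the mean and vanishes exactly for pure translations ($s = 1$).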

Transport-Information (TI) Inequality:

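$\alpha(T_c(\nu,\mu)) \leq I(\nu\mid\mu)$

with $I(\nu\mid\mu)$ the Donsker–Varadhan (or relative Fisher) information associated to a symmetric Markov generator $\mathcal{L}$:

$I(\nu\mid\mu) = \mathcal{E}(\sqrt{f},\sqrt{f})$

for $f = \frac{d\nu}{d\mu}$ and Dirichlet form $\mathcal{E}(g,g) = -\int g\,\mathcal{L}g\,d\mu$ (Gozlan et al., 2010, Niles-Weed, 13 Mar 2026).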
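To make the Donsker–Varadhan information concrete, the following sketch assumes the one-dimensional Ornstein–Uhlenbeck setting: reference measure $\gamma = N(0,1)$ and Dirichlet form $\mathcal{E}(g,g) = \int |g'|^2\, d\gamma$. For $\nu = N(m, s^2)$ with density $f$ relative to $\gamma$, the value $\int |(\sqrt{f})'|^2\, d\gamma$ equals $\tfrac14\big(m^2 + (s - 1/s)^2\big)$. The code compares this closed form with a midpoint-rule quadrature and checks the bound $W_2^2(\nu,\gamma) \leq 4\,I(\nu\mid\gamma)$ on this family. All names are illustrative.

```python
import math


def dv_information_closed_form(m: float, s: float) -> float:
    """Integral of |(sqrt f)'|^2 d gamma for nu = N(m, s^2), f = d nu / d gamma."""
    return 0.25 * (m * m + (s - 1.0 / s) ** 2)


def dv_information_quadrature(m: float, s: float, n_grid: int = 20000) -> float:
    """Same quantity via the midpoint rule, using |(sqrt f)'|^2 gamma = (1/4) ((log f)')^2 nu."""
    lo, hi = m - 12.0 * s, m + 12.0 * s
    h = (hi - lo) / n_grid
    total = 0.0
    for k in range(n_grid):
        x = lo + (k + 0.5) * h
        density_nu = math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        dlog_f = x - (x - m) / (s * s)  # derivative of log(d nu / d gamma)
        total += 0.25 * dlog_f ** 2 * density_nu * h
    return total


def w2_squared(m: float, s: float) -> float:
    """W_2^2 between N(m, s^2) and N(0, 1)."""
    return m * m + (s - 1.0) ** 2


if __name__ == "__main__":
    m, s = 1.0, 2.0
    print(dv_information_closed_form(m, s), dv_information_quadrature(m, s))
    print(w2_squared(m, s), 4.0 * dv_information_closed_form(m, s))
```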

The canonical examples are the $p$-Wasserstein distances $W_p$, with $p=1$ and $p=2$:

$T_2(C)$ (Talagrand's): $W_2^2(\nu,\mu) \leq 2C\,H(\nu\mid\mu)$
$W_2I(C)$ (quadratic TI): $W_2^2(\nu,\mu) \leq 4C^2\,I(\nu\mid\mu)$

Related divergences include Rényi divergences, $f$-divergences, and their convex-analytic variants, all of which can induce transport-type inequalities (Altschuler et al., 2023).

2. Classical Models: Euclidean, Markov Semigroups, and Function Spaces

2.1. Euclidean Diffusions and Semigroup Methods

For $\nu$ 5 on $\nu$ 6 or Riemannian manifolds, TI inequalities can be derived under convexity or curvature-dimension assumptions. A paradigmatic result for $\nu$ 7 (standard Gaussian) is:

$\nu$ 8

Building from this, Kolesnikov (Kolesnikov, 2010) showed (for uniformly convex potentials) that the Fisher information of $\nu$ 9 controls the $\mu$ 0 Sobolev regularity of the transport map, leading to:

$\mu$ 1

Semigroup (Ornstein–Uhlenbeck) interpolation provides De Bruijn-type formulas:

$\mu$ 2

and leads to strengthened inequalities involving the Stein kernel, most notably the Ledoux–Nourdin–Peccati HSI and WSH inequalities (named for the entropy $H$, Stein discrepancy $S$, Fisher information $I$, and Wasserstein distance $W$ they relate), which interpolate between and strictly strengthen the log-Sobolev and Talagrand inequalities when the Stein discrepancy is small (Ledoux et al., 2014).
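A De Bruijn-type identity can be checked by hand in the Gaussian case. The sketch below assumes the Ornstein–Uhlenbeck semigroup with generator $\mathcal{L}g = g'' - x g'$ on $\mathbb{R}$ and uses the normalization $J(\nu\mid\gamma) = \int |(\log f)'|^2\, d\nu$ (four times the Donsker–Varadhan information), for which the identity reads $\frac{d}{dt} H(\nu_t\mid\gamma) = -J(\nu_t\mid\gamma)$. Starting from $\nu_0 = N(m_0, s_0^2)$, the law $\nu_t$ stays Gaussian, so both sides are explicit. Names and normalization are assumptions made for this illustration.

```python
import math


def ou_marginal(m0: float, s0: float, t: float) -> tuple[float, float]:
    """Mean and standard deviation at time t of the OU flow started from N(m0, s0^2)."""
    mean = m0 * math.exp(-t)
    var = 1.0 + (s0 * s0 - 1.0) * math.exp(-2.0 * t)
    return mean, math.sqrt(var)


def relative_entropy(m: float, s: float) -> float:
    """H(N(m, s^2) | N(0, 1))."""
    return 0.5 * (s * s + m * m - 1.0 - math.log(s * s))


def relative_fisher(m: float, s: float) -> float:
    """J(N(m, s^2) | N(0, 1)), the integral of |(log f)'|^2 against nu."""
    return m * m + (s - 1.0 / s) ** 2


def entropy_derivative(m0: float, s0: float, t: float, dt: float = 1e-5) -> float:
    """Central finite-difference approximation of d/dt H(nu_t | gamma)."""
    h_plus = relative_entropy(*ou_marginal(m0, s0, t + dt))
    h_minus = relative_entropy(*ou_marginal(m0, s0, t - dt))
    return (h_plus - h_minus) / (2.0 * dt)


if __name__ == "__main__":
    for t in (0.1, 0.5, 2.0):
        lhs = entropy_derivative(1.0, 2.0, t)
        rhs = -relative_fisher(*ou_marginal(1.0, 2.0, t))
        print(f"t={t}: dH/dt={lhs:.8f}, -J={rhs:.8f}")
```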

2.2. Lyapunov and Variational Perspectives

TI inequalities can be equivalently characterized by Lyapunov drift conditions: existence of $\mu$ 3 such that

$\mu$ 4

implies $\mu$ 5, i.e., $\mu$ 6-type inequality (Cattiaux et al., 2010, Liu, 2015). The approach is robust under perturbations and admits generalization to non-convex and degenerate settings.
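As a hedged illustration of what a drift condition can look like, assume the form $\mathcal{L}W \leq (b - c\,x^2)\,W$ with $W \geq 1$. This is one form found in the Lyapunov-function literature, and the exact condition in the cited works may differ. For the one-dimensional Ornstein–Uhlenbeck generator $\mathcal{L}g = g'' - x g'$ and $W(x) = e^{\delta x^2/2}$ with $0 < \delta < 1$, a direct computation gives $\mathcal{L}W / W = \delta - \delta(1-\delta)x^2$, so the condition holds with $b = \delta$ and $c = \delta(1-\delta)$. The sketch verifies this numerically with finite differences; every name in it is hypothetical.

```python
import math


def lyapunov_w(x: float, delta: float) -> float:
    """Candidate Lyapunov function W(x) = exp(delta * x^2 / 2), which is >= 1."""
    return math.exp(0.5 * delta * x * x)


def ou_generator(func, x: float, h: float = 1e-4) -> float:
    """Finite-difference approximation of (L func)(x) = func''(x) - x * func'(x)."""
    first = (func(x + h) - func(x - h)) / (2.0 * h)
    second = (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)
    return second - x * first


def drift_ratio(x: float, delta: float) -> float:
    """Numerical value of (L W)(x) / W(x)."""
    return ou_generator(lambda y: lyapunov_w(y, delta), x) / lyapunov_w(x, delta)


def drift_bound(x: float, delta: float) -> float:
    """Closed form b - c * x^2 with b = delta and c = delta * (1 - delta)."""
    return delta - delta * (1.0 - delta) * x * x


if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.5):
        print(x, drift_ratio(x, 0.5), drift_bound(x, 0.5))
```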

Variational characterizations interpret $\mu$ 7 inequalities as the absence of nontrivial critical points of entropy–transport functionals $\mu$ 8 (Fontbona et al., 2015).

2.3. Markov Process and Product Structure

The $\mu$ 9 inequality for Markov processes is equivalent to dimension-free Gaussian concentration for empirical averages of i.i.d. copies:

$H(\nu\mid\mu)$ 0

for all $H(\nu\mid\mu)$ 1, $H(\nu\mid\mu)$ 2-Lipschitz $H(\nu\mid\mu)$ 3 (Lacker et al., 2020). Such characterizations connect functional, semigroup, and large deviations formulations.
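Dimension-free Gaussian concentration for empirical averages can be seen in the simplest product setting. The sketch assumes i.i.d. standard Gaussian samples rather than a general Markov process and uses the 1-Lipschitz function $f(x) = |x|$. The map $(x_1,\dots,x_n) \mapsto \frac1n \sum_i f(x_i)$ is $n^{-1/2}$-Lipschitz, so Gaussian concentration gives $P\big(\frac1n\sum_i f(X_i) - E f(X_1) > r\big) \leq e^{-n r^2/2}$. The Monte Carlo estimate below should fall under that bound. Function names and parameters are illustrative.

```python
import math
import random


def empirical_tail(n: int, r: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of P(mean of |X_i| exceeds its expectation by more than r)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    mean_abs = math.sqrt(2.0 / math.pi)  # E|Z| for Z ~ N(0, 1)
    hits = 0
    for _ in range(trials):
        average = sum(abs(rng.gauss(0.0, 1.0)) for _ in range(n)) / n
        if average - mean_abs > r:
            hits += 1
    return hits / trials


def gaussian_bound(n: int, r: float) -> float:
    """Upper bound exp(-n r^2 / 2) from Gaussian concentration."""
    return math.exp(-0.5 * n * r * r)


if __name__ == "__main__":
    n, r = 20, 0.3
    print(empirical_tail(n, r, trials=2000), gaussian_bound(n, r))
```

The bound is far from tight for this particular $f$, which is expected: it holds uniformly over all 1-Lipschitz functions.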

3. Transport-Information in Discrete and Structured Spaces

3.1. Discrete Markov Chains and Graphs

Transport-information inequalities extend to discrete settings (finite Markov chains, graphs), where the carré du champ operator plays the role of a discrete squared gradient. Under discrete curvature-dimension conditions (Bakry–Émery, Ollivier), one obtains:

Bakry–Émery CD $H(\nu\mid\mu)$ 4:

$H(\nu\mid\mu)$ 5

Ollivier coarse Ricci $H(\nu\mid\mu)$ 6:

$H(\nu\mid\mu)$ 7

where $H(\nu\mid\mu)$ 8 (Fathi et al., 2015).

On finite graphs, Ma–Wang–Wu develop explicit path-based estimates for the constant $H(\nu\mid\mu)$ 9 in the discrete $I(\nu\mid\mu)$ 0–information inequality:

$I(\nu\mid\mu)$ 1

where $I(\nu\mid\mu)$ 2 is controlled via a sum over network paths and edges (Ma et al., 2015).
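A very small discrete example shows the objects involved. The sketch assumes a two-state continuous-time chain with jump rates $a$ (from 0 to 1) and $b$ (from 1 to 0), the trivial metric $d(x,y) = 1_{x \neq y}$ (so that $W_1$ is the total variation distance), and the Dirichlet form $\mathcal{E}(g,g) = \frac12\sum_{x,y}\mu(x)Q(x,y)(g(y)-g(x))^2$. It checks the elementary bound $W_1^2(\nu,\mu) \leq c_P\, I(\nu\mid\mu)$ with Poincaré constant $c_P = 1/(a+b)$, which follows from Cauchy–Schwarz and the Poincaré inequality. This is not the curvature-based or path-based estimate of the cited works, and all names are hypothetical.

```python
import math


def stationary(a: float, b: float) -> tuple[float, float]:
    """Reversible measure of the two-state chain with rates a (0 -> 1) and b (1 -> 0)."""
    return b / (a + b), a / (a + b)


def dv_information(nu: tuple[float, float], a: float, b: float) -> float:
    """I(nu | mu) = E(sqrt f, sqrt f) with f = nu / mu; here E(g, g) = mu(0) * a * (g(1) - g(0))^2."""
    mu = stationary(a, b)
    root_f = [math.sqrt(nu[i] / mu[i]) for i in range(2)]
    return mu[0] * a * (root_f[1] - root_f[0]) ** 2


def w1_trivial_metric(nu: tuple[float, float], a: float, b: float) -> float:
    """W_1 for the metric 1_{x != y}, i.e. half the L1 distance between nu and mu."""
    mu = stationary(a, b)
    return 0.5 * (abs(nu[0] - mu[0]) + abs(nu[1] - mu[1]))


def poincare_constant(a: float, b: float) -> float:
    """Var_mu(g) = E(g, g) / (a + b) for every g on two points."""
    return 1.0 / (a + b)


if __name__ == "__main__":
    a, b = 1.0, 2.0
    for q in (0.0, 0.3, 0.5, 1.0):
        nu = (1.0 - q, q)
        lhs = w1_trivial_metric(nu, a, b) ** 2
        rhs = poincare_constant(a, b) * dv_information(nu, a, b)
        print(f"q={q}: W1^2={lhs:.6f} <= c_P * I={rhs:.6f}")
```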

3.2. Trees and Bifurcating Markov Chains

For processes on trees, e.g., bifurcating Markov chains modeling cell division, the law of the process up to generation $I(\nu\mid\mu)$ 3 satisfies

$I(\nu\mid\mu)$ 4

with $I(\nu\mid\mu)$ 5 scaling appropriately with tree depth and structure; this enables non-asymptotic concentration bounds for empirical means (Penda et al., 2015).

3.3. Point Processes: Poisson, Binomial

Transport-information inequalities lift via tensorization and symmetrization to Poisson and mixed binomial point processes. In particular, if a base measure $I(\nu\mid\mu)$ 6 satisfies $I(\nu\mid\mu)$ 7, the Poisson process law $I(\nu\mid\mu)$ 8 on the configuration space satisfies

$I(\nu\mid\mu)$ 9

with preservation of constants (Gozlan et al., 2020). Universal Marton-type ( $\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$ 0) inequalities imply robust concentration properties for convex functionals of the random point fields.

4. Functional and Dimensional Extensions

4.1. Reweighted and Mixture Models

For measures $\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$ 1 decomposed as a mixture of components $\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$ 2 each admitting TI inequalities with profile $\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$ 3, Niles-Weed (Niles-Weed, 13 Mar 2026) shows that proximity in Fisher information to $\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$ 4 guarantees proximity in transport (with the same profile) to some reweighted mixture $\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$ 5 in the same family. This "reweighted TI" mechanism allows effective treatment of non-log-concave or multimodal targets in sampling and metastability analysis, and yields structurally optimal convergence rates.

4.2. Quantum and Infinite-Dimensional Models

Quantum extensions of TI inequalities are formulated with quantum Wasserstein distances, using tools such as quantum Ricci curvature and Petz maps, and show that high-temperature Gibbs states satisfy

$\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$ 6

with $\alpha(T_c(\nu,\mu)) \leq H(\nu\mid\mu)$ 7, yielding quantum Gaussian concentration, equivalence of ensembles, and exponential improvement over previous versions of the Eigenstate Thermalization Hypothesis (Palma et al., 2021).

For infinite-dimensional path spaces (diffusions, SPDEs), TI inequalities control the law of the process, e.g., for solutions of stochastic wave equations, where the constant depends on the time horizon, the "size" of the fundamental solution, and the Lipschitz constant of the drift (Li et al., 2018).

5. Methods of Proof and Structural Results

5.1. Semigroup Interpolation and $\Gamma$-Calculus

Heat-flow (e.g., Ornstein–Uhlenbeck) interpolation and $\Gamma$-calculus provide both sharp estimates and dynamical proofs, yielding exponential convergence in entropy, moment bounds, and direct connections between log-Sobolev, Talagrand, and TI inequalities (Ledoux et al., 2014, Cattiaux et al., 2010).
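Exponential convergence in entropy can be observed directly on Gaussian data. The sketch assumes the one-dimensional Ornstein–Uhlenbeck flow with invariant measure $\gamma = N(0,1)$, for which the log-Sobolev inequality yields $H(\nu_t\mid\gamma) \leq e^{-2t} H(\nu_0\mid\gamma)$. It evaluates both sides for a Gaussian initial law $N(m_0, s_0^2)$. The function name is illustrative.

```python
import math


def entropy_along_ou(m0: float, s0: float, t: float) -> float:
    """H(nu_t | gamma) for the OU flow started from N(m0, s0^2)."""
    mean = m0 * math.exp(-t)
    var = 1.0 + (s0 * s0 - 1.0) * math.exp(-2.0 * t)
    return 0.5 * (var + mean * mean - 1.0 - math.log(var))


if __name__ == "__main__":
    m0, s0 = 1.0, 2.0
    h0 = entropy_along_ou(m0, s0, 0.0)
    for t in (0.0, 0.5, 1.0, 2.0):
        print(f"t={t}: H_t={entropy_along_ou(m0, s0, t):.6f}, bound={math.exp(-2.0 * t) * h0:.6f}")
```

For a pure translation ($s_0 = 1$) the bound is attained with equality, which reflects the sharpness of the rate $e^{-2t}$ in this setting.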

5.2. Variational and Duality Techniques

Bobkov–Götze duality, Kantorovich duality, and convex-analytic tensorization allow characterization of TI inequalities in terms of optimization problems on probability measures and test functions, leading to unification with large deviations and Sanov's theorem (Fontbona et al., 2015, Lacker et al., 2020).
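The Bobkov–Götze criterion can be made tangible as follows. It states that $W_1^2(\nu,\mu) \leq 2C\,H(\nu\mid\mu)$ for all $\nu$ holds exactly when $\int e^{\lambda(f - \mu(f))}\,d\mu \leq e^{C\lambda^2/2}$ for every 1-Lipschitz $f$ and every real $\lambda$. The sketch takes $\mu = N(0,1)$, $C = 1$, and the test function $f(x) = |x|$, for which $E\,e^{\lambda|Z|} = 2e^{\lambda^2/2}\Phi(\lambda)$ in closed form, and checks the exponential-moment bound on a grid of $\lambda$. The choice of test function and the names are assumptions made for illustration.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """Phi(x), computed with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def log_mgf_centered_abs(lam: float) -> float:
    """log E exp(lam * (|Z| - E|Z|)) for Z ~ N(0, 1), from the closed form of E exp(lam |Z|)."""
    mean_abs = math.sqrt(2.0 / math.pi)
    return 0.5 * lam * lam + math.log(2.0 * standard_normal_cdf(lam)) - lam * mean_abs


if __name__ == "__main__":
    for lam in (-3.0, -1.0, 0.5, 2.0):
        print(f"lambda={lam}: log-mgf={log_mgf_centered_abs(lam):.6f}, bound={0.5 * lam * lam:.6f}")
```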

5.3. Curvature and Path Methods

Finite and discrete models leverage curvature conditions (Bakry–Émery, Ollivier) and path decompositions (random-path methods), giving explicit dimension and geometry dependence of TI constants (Fathi et al., 2015, Ma et al., 2015).

5.4. Lyapunov Function and Drift

Lyapunov drift conditions provide robust sufficient (and under suitable regularity, necessary) criteria for TI inequalities, extending to general state spaces and degenerate situations (Cattiaux et al., 2010, Liu, 2015).

6. Applications: Concentration, Stability, and Extensions

Concentration of measure: TI inequalities imply sub-Gaussian (or better) deviation bounds for Lipschitz observables and empirical functions, uniformly over system size and complexity (Gozlan et al., 2010, Penda et al., 2015, Ma et al., 2015).
Stability: TI properties are stable under bounded density perturbations (Holley–Stroock), tensorization, and push-forward through suitable maps such as Lipschitz maps (Liu, 2015, Gozlan et al., 2020, Niles-Weed, 13 Mar 2026).
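Functional inequalities: TI inequalities fit into a chain of implications with related inequalities (log-Sobolev, Poincaré, weak/super-Poincaré) and with dimension-free concentration: they follow from stronger inequalities such as log-Sobolev and in turn imply weaker ones (Liu, 2015, Fontbona et al., 2015, Cattiaux et al., 2010).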
Sampling and Markov processes: Reweighted TI enables new guarantees for convergence and sampling in non-log-concave, multimodal, or high-dimensional pathological regimes (Niles-Weed, 13 Mar 2026).
Quantum processes and random matrices: Quantum transportation cost inequalities (TCIs) provide concentration and mixing results that extend the classical theory (Palma et al., 2021).
Interacting particle systems and SPDEs: TI controls concentration, hydrodynamic limits, and deviation principles for high-dimensional, non-product processes (Li et al., 2018, Pal et al., 2018).

Table: Canonical TI Inequalities and Settings

Setting	TI Formulation	Key Constant/Assumptions
Gaussian measure $\alpha$ 0	$\alpha$ 1	Convexity, semigroup smoothing (Gozlan et al., 2010, Ledoux et al., 2014)
Uniformly convex log-density	$\alpha$ 2	$\alpha$ 3 (Kolesnikov, 2010)
Discrete Markov chain	$\alpha$ 4	CD $\alpha$ 5 curvature (Fathi et al., 2015)
Point processes	$\alpha$ 6	Base $\alpha$ 7 condition, tensorization (Gozlan et al., 2020)
Mixtures	$\alpha$ 8	Reweighted over mixture components (Niles-Weed, 13 Mar 2026)
Quantum system	$\alpha$ 9	Curvature, mLSI, or local indistinguishability (Palma et al., 2021)
Path space/SPDE	$c$ 0	Lipschitz drift, Gaussian/correlated noise (Li et al., 2018, Pal et al., 2018)

7. Perspectives and Current Directions

Transport-information inequalities continue to drive new developments in:

Sampling theory for non-log-concave and multimodal distributions, via modular control of Fisher information and proximity to mixtures (Niles-Weed, 13 Mar 2026).
Quantum information geometry and concentration, leveraging non-commutative analogues of curvature and optimal transport (Palma et al., 2021).
Infinite-dimensional and interacting systems, including reflected diffusions and stochastic PDEs (Li et al., 2018, Pal et al., 2018).
Integration with functional analysis, large deviations, and spectral theory via convex-analytic duality and variational frameworks (Fontbona et al., 2015, Lacker et al., 2020).
Discrete and combinatorial models, where curvature, path, and concentration phenomena are realized in non-Euclidean or random environments (Fathi et al., 2015, Ma et al., 2015, Penda et al., 2015).
Explicit computation and optimization of constants and profiles, especially via geometric or path-based methods.
Robustness, stability, and transference under perturbation, tensorization, and map-based contraction (Liu, 2015, Niles-Weed, 13 Mar 2026).

These advances underline the central role of transport-information inequalities as fundamental, structurally robust bridges between geometry, probability, and analysis.