Diffusion on the Probability Simplex

Updated 11 June 2026

Diffusion on the probability simplex is a continuous stochastic process that confines trajectories to normalized probability vectors using degenerate noise and geometric constraints.
Methodologies integrate gradient flows, Itô SDEs, softmax transformations, and mirror maps to maintain conservation within the simplex.
Applications span population genetics, statistical physics, Bayesian inference, and score-based generative modeling, impacting sampling and ergodicity.

A diffusion on the probability simplex is a continuous-time stochastic process whose trajectories remain confined to the standard simplex $\Delta^{d-1} = \{\rho \in \mathbb{R}^d\mid \rho_i \geq 0,~\sum_{i=1}^d \rho_i = 1\}$, the space of probability vectors of fixed dimension. Such processes arise naturally in a broad spectrum of contexts, including population genetics, statistical physics, Bayesian nonparametrics, Markov process theory, numerical methods for constrained sampling, and modern score-based generative modeling for discrete/categorical data. Multiple non-equivalent constructions exist, governed by different choices of stochastic calculus, boundary conditions, degeneracy structure, and invariant laws.

1. Foundational Structures: Deterministic Gradient Flows and Metric Geometry

The deterministic foundation for simplex diffusions is often a gradient-flow ordinary differential equation

$\dot\rho = -L(\rho)\nabla F(\rho)$

where $F:\Delta^{d-1}\to\mathbb{R}$ is a smooth free energy function and $L(\rho)$ is a symmetric, positive semidefinite Onsager response matrix with $L(\rho)1_d = 0$. Because $L(\rho)$ is symmetric, this also gives $1_d^\top L(\rho) = 0$, so the flow is tangent to the simplex, i.e., $\sum_{i=1}^d \dot\rho_i = 0$, and the evolution preserves probability mass.

The induced metric

$g_\rho(u,v) = u^\top L(\rho)^\dagger v$

(with $L(\rho)^\dagger$ the Moore–Penrose pseudoinverse) defines a degenerate Riemannian geometry on the simplex, orthogonal to the direction of the constant vector $1_d$ and adapted to the underlying detailed-balance structure of physical or chemical systems. This geometric apparatus underpins Wasserstein (optimal transport) formulations for discrete and continuous spaces (Gao et al., 2024, Li, 2023).
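The mass-conservation property of the gradient flow can be checked numerically. The following is a minimal sketch, not a construction taken from the cited works: it assumes the particular response matrix $L(\rho) = \mathrm{diag}(\rho) - \rho\rho^\top$ (one symmetric, positive semidefinite choice with $L(\rho)1_d = 0$), takes $F$ to be the Kullback–Leibler divergence to a reference vector `pi`, and uses explicit Euler steps. The function names are illustrative.

```python
import numpy as np

def onsager_matrix(rho):
    # Assumed example choice: L(rho) = diag(rho) - rho rho^T.
    # It is symmetric, positive semidefinite, and satisfies L(rho) 1 = 0.
    return np.diag(rho) - np.outer(rho, rho)

def kl(rho, pi):
    # Free energy F(rho) = sum_i rho_i log(rho_i / pi_i).
    return float(np.sum(rho * np.log(rho / pi)))

def grad_kl(rho, pi):
    # Euclidean gradient of F; the constant +1 is annihilated by L(rho).
    return np.log(rho / pi) + 1.0

def gradient_flow(rho0, pi, dt=0.01, n_steps=500):
    # Explicit Euler discretization of rho' = -L(rho) grad F(rho).
    rho = np.array(rho0, dtype=float)
    path = [rho.copy()]
    for _ in range(n_steps):
        rho = rho - dt * onsager_matrix(rho) @ grad_kl(rho, pi)
        path.append(rho.copy())
    return np.array(path)

pi = np.full(3, 1.0 / 3.0)
path = gradient_flow([0.7, 0.2, 0.1], pi)

# Induced metric g_rho(u, v) = u^T L(rho)^+ v on a tangent vector (sums to zero).
u = np.array([1.0, -1.0, 0.0])
metric_uu = float(u @ np.linalg.pinv(onsager_matrix(path[0])) @ u)

print("final state:", path[-1])
print("total mass :", path[-1].sum())
print("F start/end:", kl(path[0], pi), kl(path[-1], pi))
```

With this choice of $L$ the flow is the replicator equation, so the sketch also illustrates how a specific Onsager matrix fixes the geometry in which $F$ decreases.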

2. Stochastic Lifts: Langevin and Score-Based Diffusions

Stochastic dynamics are incorporated by introducing a degenerate noise process whose covariance respects the simplex geometry. The prototypical Itô SDE is

$d\rho_t = -L(\rho_t)\nabla F(\rho_t)\,dt + \sqrt{2L(\rho_t)}\,dW_t$

where $W_t$ is a $d$-dimensional Brownian motion, and the noise is restricted to the simplex tangent space via spectral decomposition of $L(\rho_t)$. The resulting process is confined to the simplex and, under mild regularity, is ergodic with respect to an invariant measure (Gao et al., 2024).
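A short simulation makes the role of the spectral decomposition concrete. This is a sketch under stated assumptions rather than the scheme of Gao et al.: it uses an Euler–Maruyama discretization, the example matrix $L(\rho) = \mathrm{diag}(\rho) - \rho\rho^\top$, and a Kullback–Leibler free energy to the uniform vector. The final clip-and-renormalize step is a numerical safeguard against discretization overshoot and is not part of the SDE.

```python
import numpy as np

def onsager_matrix(rho):
    # Assumed example choice with L(rho) 1 = 0 (not prescribed by the source).
    return np.diag(rho) - np.outer(rho, rho)

def sqrt_psd(mat):
    # Symmetric square root via eigendecomposition. The zero eigenvalue along
    # the constant vector carries no noise, so increments sum to zero.
    evals, evecs = np.linalg.eigh(mat)
    evals = np.clip(evals, 0.0, None)
    return (evecs * np.sqrt(evals)) @ evecs.T

def simulate(rho0, grad_F, dt=1e-3, n_steps=1000, seed=0, eps=1e-12):
    rng = np.random.default_rng(seed)
    rho = np.array(rho0, dtype=float)
    for _ in range(n_steps):
        L = onsager_matrix(rho)
        noise = sqrt_psd(2.0 * L) @ rng.normal(size=rho.size)
        rho = rho - dt * L @ grad_F(rho) + np.sqrt(dt) * noise
        # Safeguard only: Euler-Maruyama can overshoot the boundary.
        rho = np.clip(rho, eps, None)
        rho /= rho.sum()
    return rho

def grad_F(rho):
    # Gradient of KL(rho || uniform) in dimension 3.
    return np.log(3.0 * rho) + 1.0

rho_start = np.array([0.5, 0.3, 0.2])
S = sqrt_psd(2.0 * onsager_matrix(rho_start))
rho_end = simulate(rho_start, grad_F)
print("noise matrix applied to 1:", S @ np.ones(3))
print("state after simulation  :", rho_end)
```

Because the square root inherits the null direction $1_d$ from $L(\rho)$, each noise increment is tangent to the simplex before any safeguard is applied.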

Score-based generative modeling approaches instead construct diffusion models defined by mapping Gaussian processes through transformations (e.g., softmax of Ornstein–Uhlenbeck processes) or via mirror maps. These models are equipped with explicit forward and reverse SDEs:

Softmax-based models apply an additive logistic map to an unconstrained latent diffusion, such as an Ornstein–Uhlenbeck process, deriving a new SDE for the mapped process on the simplex (Floto et al., 2023).
Mirror Langevin approaches define SDEs respecting the geometry induced by a convex potential, e.g., the negative entropy $\sum_{i} \rho_i \log \rho_i$ (Tae, 2023).
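The softmax-based construction can be illustrated with a forward sample. The sketch below assumes a latent Ornstein–Uhlenbeck process $dY_t = -\theta Y_t\,dt + \sigma\,dW_t$ in $\mathbb{R}^{d-1}$ with hypothetical parameters `theta` and `sigma`, and uses its exact Gaussian transition. It is not the parameterization of Floto et al., only an instance of the general idea.

```python
import numpy as np

def additive_logistic(y):
    # Map y in R^(d-1) to the simplex: append a zero logit, then softmax.
    zeros = np.zeros(y.shape[:-1] + (1,))
    z = np.concatenate([y, zeros], axis=-1)
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def additive_log_ratio(x):
    # Inverse map: y_i = log(x_i / x_d) for i < d.
    return np.log(x[..., :-1] / x[..., -1:])

def ou_forward_sample(y0, t, theta=1.0, sigma=1.0, rng=None):
    # Exact transition of dY = -theta * Y dt + sigma dW started at y0.
    rng = np.random.default_rng() if rng is None else rng
    mean = y0 * np.exp(-theta * t)
    std = sigma * np.sqrt((1.0 - np.exp(-2.0 * theta * t)) / (2.0 * theta))
    return mean + std * rng.normal(size=np.shape(y0))

x0 = np.array([0.6, 0.3, 0.1])
y0 = additive_log_ratio(x0)
rng = np.random.default_rng(0)
xt = additive_logistic(ou_forward_sample(y0, t=0.5, rng=rng))
print("noised point on the simplex:", xt, "sum =", xt.sum())
```

Since the latent variable is Gaussian at every time, the mapped point follows a logistic-normal law, which is what gives the forward process a closed-form density.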

In all such constructions, the stochastic process is degenerate at the boundary (corresponding to vanishing simplex coordinates), and the diffusion matrix has rank at most $d-1$.

3. Fokker–Planck, Laplace–Beltrami, and Invariant Measures

The evolution of densities $p(\rho,t)$ under simplex diffusions is governed by the Fokker–Planck PDE, which can be written extrinsically or using the Laplace–Beltrami operator of the induced geometry: $\partial_t p = \nabla\cdot\big(p\,L(\rho)\nabla F(\rho)\big) + \sum_{i,j=1}^d \partial_{\rho_i}\partial_{\rho_j}\big(L_{ij}(\rho)\,p\big)$ or, intrinsically,

$\partial_t p = \Delta_g p + \mathrm{div}_g\big(p\,\nabla_g F\big)$

where $\Delta_g$ is the Laplace–Beltrami operator associated to the Riemannian structure $g_\rho = L(\rho)^\dagger$. The invariant measure is the generalized Gibbs law

$\pi(d\rho) \propto e^{-F(\rho)}\,\mathrm{vol}_g(d\rho)$

which incorporates both the energy landscape and the local volume form of the metric. In the drift-free ($F \equiv 0$) case, the process reduces to a "canonical Wasserstein diffusion" or reversible Brownian motion on the simplex endowed with the discrete graph Wasserstein geometry (Gao et al., 2024, Li, 2023).

4. Absorbing, Boundary, and Entrance Behavior

Boundary behavior is crucial: in finite dimensions, most simplex diffusions have degenerate (square-root-type) noise coefficients vanishing at the boundary, ensuring that once a coordinate reaches zero, it remains pinned there (absorption). This mechanism is central to the uniqueness and existence theory for absorbed martingale problems (Beltrán et al., 2015), and is responsible for the process's stability within the simplex or its lower-dimensional faces. In infinite dimensions (Kingman or Thoma simplex), Petrov’s diffusion and its generalizations have a "mass-deficient" entrance boundary: sample paths starting off the simplex instantly enter the boundary set $\{x : \sum_i x_i = 1\}$ and never leave (Ethier, 2014, Korotkikh, 2018, Costantini et al., 2026). The structure of absorption versus entrance determines the long-time ergodicity and support of invariant measures.

For two-point simplexes ($d = 2$), all canonical constructions reduce to one-dimensional Wright–Fisher diffusions, whose SDE and Fokker–Planck equations admit explicit zero-flux boundary conditions preserving probability conservation within $[0,1]$ (Gao et al., 2024, Li, 2023).
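The two-point case is simple enough to simulate directly. As a sketch, assume the example response matrix $L(\rho) = \mathrm{diag}(\rho) - \rho\rho^\top$ and no drift; writing $x = \rho_1$, the simplex SDE then reduces to $dx_t = \sqrt{2x_t(1-x_t)}\,dB_t$ on $[0,1]$. The code uses Euler–Maruyama with clipping to $[0,1]$, which is a discretization choice and not part of the continuous model.

```python
import numpy as np

def wright_fisher_paths(x0, dt=1e-3, n_steps=2000, n_paths=500, seed=0):
    # Euler-Maruyama for dx = sqrt(2 x (1 - x)) dB, clipped to [0, 1].
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, float(x0))
    history = np.empty((n_steps + 1, n_paths))
    history[0] = x
    for k in range(n_steps):
        dB = rng.normal(scale=np.sqrt(dt), size=n_paths)
        x = x + np.sqrt(2.0 * x * (1.0 - x)) * dB
        x = np.clip(x, 0.0, 1.0)  # overshoot is projected back to [0, 1]
        history[k + 1] = x
    return history

history = wright_fisher_paths(0.5)
mid = history.shape[0] // 2
absorbed_mid = (history[mid] == 0.0) | (history[mid] == 1.0)
print("fraction absorbed by mid-time:", absorbed_mid.mean())
print("fraction absorbed at the end :",
      ((history[-1] == 0.0) | (history[-1] == 1.0)).mean())
```

Once a path sits at 0 or 1 the noise coefficient is exactly zero, so the discrete scheme reproduces the pinning described above: absorbed paths do not move again.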

5. Analytical and Computational Frameworks

Several computationally tractable frameworks have been proposed:

CIR-based SDEs: Each coordinate is evolved via a Cox–Ingersoll–Ross process, and normalization maps to the simplex. The stationary law is Dirichlet, and exact transition kernels (non-central chi-square) allow unbiased sampling (SCIR), crucial for Bayesian SGMCMC on sparse simplexes (Baker et al., 2018, Richemond et al., 2022).
Score-based simplex diffusions: Using softmax-mapped Ornstein–Uhlenbeck processes or mirror map formalisms leads to forward processes with closed-form densities (logistic-normal, Beta), supporting denoising score-matching objectives (Floto et al., 2023, Tae, 2023).
Averaging and large deviations: In fast-slow systems, averaging principles show that fast diffusions on the simplex, with degenerate coefficients vanishing at vertices, converge (in Meyer–Zheng topology) to Markovian jump processes among the simplex's vertices (Faure, 2021). In small-noise limits, Freidlin–Wentzell theory rigorously establishes the limiting diffusive dynamics on the simplex of invariant measures (Freidlin, 2020).
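The CIR-based construction in the first item lends itself to a compact illustration. The sketch below assumes each coordinate follows $d\theta_t = (a - \theta_t)\,dt + \sqrt{2\theta_t}\,dW_t$, for which the transition over a step $h$ is a scaled non-central chi-square and the stationary law is $\mathrm{Gamma}(a, 1)$, so normalized coordinates are Dirichlet distributed at stationarity. It omits the stochastic-gradient component of SCIR, and the function names and defaults are illustrative.

```python
import numpy as np

def cir_exact_step(theta, a, h, rng):
    # Exact transition of d(theta) = (a - theta) dt + sqrt(2 theta) dW:
    # theta_{t+h} = c * W, W ~ noncentral chi-square(2a, theta * exp(-h) / c),
    # with c = (1 - exp(-h)) / 2.
    c = (1.0 - np.exp(-h)) / 2.0
    nonc = theta * np.exp(-h) / c
    return c * rng.noncentral_chisquare(df=2.0 * a, nonc=nonc)

def sample_simplex_via_cir(a, n_chains=4000, h=0.5, n_steps=50, seed=0):
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    theta = np.ones((n_chains, a.size))  # arbitrary positive start
    for _ in range(n_steps):
        theta = cir_exact_step(theta, a, h, rng)
    # Normalizing independent Gamma(a_i, 1) variables gives Dirichlet(a).
    return theta / theta.sum(axis=1, keepdims=True)

a = np.array([0.5, 1.0, 2.0])
samples = sample_simplex_via_cir(a)
print("empirical mean :", samples.mean(axis=0))
print("Dirichlet mean :", a / a.sum())
```

Because every step draws from the exact transition kernel, there is no time-discretization bias, and coordinates that are close to zero stay valid without any clipping.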

6. Infinite-Dimensional Generalizations

Infinite-dimensional simplex diffusions extend finite algebraic constructions. Diffusions in the Kingman simplex or the Thoma simplex model the limiting dynamics of frequencies in infinite-type population models and underlie the construction of Poisson–Dirichlet stationary distributions (Korotkikh, 2018, Ethier, 2014, Costantini et al., 2026). The generators are second-order operators with degenerate covariance structures and explicit "mutation" drift. In the Thoma simplex, Jack and Laguerre symmetric function techniques yield transition densities, spectral expansions, and connections to combinatorial structures (Korotkikh, 2018). The multi-Poisson–Dirichlet extension describes diffusions for systems with labeled frequency components, admitting entrance boundary phenomena and self-similarity properties (Costantini et al., 2026).

7. Applications and Model-Specific Properties

Simplex diffusions appear in:

Population genetics (Wright–Fisher diffusion, infinite-alleles and multi-allele models).
Chemical reaction kinetics, where the simplex encodes concentrations and the gradient-flow structure emerges from thermodynamic principles (Gao et al., 2024).
Generative modeling of discrete data (images, text): simplex diffusions provide continuous-time relaxations for categorical distributions, crucial for recent score-based diffusion models (Floto et al., 2023, Tae, 2023, Richemond et al., 2022).
Bayesian inference and SGMCMC, where unbiased simplex-constrained sampling is critical in sparse topic and mixture models (Baker et al., 2018).
Stochastic control and Markov processes on graphs, using associated Wasserstein metric structures (Li, 2023, Mano, 2022).

The selection of metric (Onsager response, Wasserstein Laplacian, mirror map), invariant measure (Gibbs, Dirichlet, Poisson–Dirichlet), and noise degeneracy structure directly impacts ergodicity, convergence, and suitability for particular algorithmic applications.

These processes are analytically tractable in several canonical regimes (heat flow, reversible diffusions, one- and two-dimensional cases), and in their infinite-dimensional forms they connect closely to combinatorics, optimal transport, and representation theory.