
Posterior Contraction Rates

Updated 25 May 2026
  • Posterior contraction rates are defined as the rate at which Bayesian posteriors concentrate around the true parameter using metrics like L2, Hellinger, or Wasserstein.
  • They serve as a critical benchmark in Bayesian nonparametric theory, validating inference approaches in high- and infinite-dimensional settings.
  • Establishing these rates involves verifying entropy and prior-mass conditions, constructing sieves, and leveraging dynamic methods such as the Benamou–Brenier formulation.

Posterior contraction rates quantify the asymptotic speed at which the Bayesian posterior distribution concentrates around the true data-generating parameter or function as the sample size increases. They are central to Bayesian nonparametric theory: they provide a frequentist benchmark for Bayesian learning and underpin uncertainty quantification for Bayesian inference in parametric as well as high- or infinite-dimensional models. The mathematical machinery for posterior contraction rates ("PCRs", an editorial abbreviation) has evolved to include information- and transportation-based distances, particularly the Wasserstein metrics, and adapts to dominated, non-dominated, parametric, nonparametric, linear, nonlinear, and even computationally discrete settings.

1. Formal Definition and Conceptual Framework

Given a statistical model $\{P_\theta : \theta \in \Theta\}$ with parameter space $\Theta$ (which may be infinite-dimensional), observations $X^{(n)} = (X_1, \dots, X_n)$, a prior $\Pi$ on $\Theta$, and sample size $n$, the posterior $\Pi_n(\cdot \mid X^{(n)})$ quantifies the conditional belief about $\theta$ after observing the data. A sequence $\varepsilon_n \to 0$ is a posterior contraction rate at $\theta_0$ if, for every sequence $M_n \to \infty$,

$$\Pi_n\big(\theta \in \Theta : d(\theta, \theta_0) \ge M_n \varepsilon_n \mid X^{(n)}\big) \to 0 \quad \text{in } P_{\theta_0}\text{-probability},$$

where $d$ is an appropriate metric, such as $L^2$, Hellinger, or Wasserstein (Dolera et al., 2020, Dolera et al., 2022, Camerlenghi et al., 2022). This quantifies that posterior mass contracts around $\theta_0$ at rate $\varepsilon_n$.
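The definition can be checked numerically in a toy conjugate model. The sketch below is an illustration only, not taken from the cited papers: it assumes observations drawn from N(theta0, 1) with a N(0, 1) prior, so the posterior is N(n * xbar / (n + 1), 1 / (n + 1)) in closed form, and it evaluates the posterior mass outside a ball of radius M * n^{-1/2} around theta0. The function names and default values are hypothetical.

```python
import math
import random


def normal_cdf(x, mean, sd):
    """CDF of N(mean, sd^2) evaluated at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def posterior_mass_outside(n, theta0=0.3, M=3.0, seed=0):
    """Posterior mass of {theta : |theta - theta0| >= M / sqrt(n)}.

    Toy model (an assumption): X_i ~ N(theta, 1) i.i.d., prior theta ~ N(0, 1).
    """
    rng = random.Random(seed)
    xbar = sum(rng.gauss(theta0, 1.0) for _ in range(n)) / n
    # Conjugate update: posterior precision is 1 + n.
    post_mean = n * xbar / (n + 1)
    post_sd = 1.0 / math.sqrt(n + 1)
    radius = M / math.sqrt(n)
    inside = (normal_cdf(theta0 + radius, post_mean, post_sd)
              - normal_cdf(theta0 - radius, post_mean, post_sd))
    return 1.0 - inside


for n in (10, 100, 1000, 10000):
    print(n, posterior_mass_outside(n))
```

For a fixed data set, increasing M can only shrink the reported mass, which mirrors the role of the diverging sequence in the definition.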

In nonparametric and high-dimensional models, $d$ is often taken as a norm on function spaces or a probability metric (e.g., the Wasserstein-$p$ distance): $W_p(\mu, \nu) = \big(\inf_{\gamma \in \Gamma(\mu, \nu)} \int \rho(x, y)^p \, \gamma(dx, dy)\big)^{1/p}$, with $\rho$ the ground metric and $\Gamma(\mu, \nu)$ the set of couplings of $\mu$ and $\nu$ (Dolera et al., 2022).
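On the real line the infimum over couplings has an explicit solution for p >= 1: between two empirical measures with the same number of equally weighted atoms, the optimal coupling pairs the sorted samples. The following minimal sketch uses that fact; the function name `wasserstein_1d` and the sample values are hypothetical.

```python
def wasserstein_1d(xs, ys, p=2.0):
    """W_p between two empirical measures on the real line.

    Assumes both samples have the same size and equal weights, so the
    optimal coupling matches the i-th smallest x with the i-th smallest y.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    cost = sum(abs(x - y) ** p for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return cost ** (1.0 / p)


sample = [0.1, 0.7, 0.4, 0.9]
shifted = [x + 0.5 for x in sample]
# A pure shift by 0.5 moves every atom by 0.5, so the distance is about 0.5.
print(wasserstein_1d(sample, shifted, p=2.0))
```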

2. Posterior Contraction in Dominated and Non-dominated Models

  • Dominated models: Classical PCRs exploit Bayes' formula; the posterior can be written as

$$\Pi_n(B \mid X^{(n)}) = \frac{\int_B \prod_{i=1}^n f(X_i \mid \theta)\,\Pi(d\theta)}{\int_\Theta \prod_{i=1}^n f(X_i \mid \theta)\,\Pi(d\theta)},$$

where $f(\cdot \mid \theta)$ is a density with respect to a dominating measure (Dolera et al., 2020).

  • Non-dominated models: In many Bayesian nonparametric constructions (e.g., Dirichlet process mixtures, normalized random measures), no single dominating measure $\lambda$ exists. In this case, one works with posterior kernels $\pi_n(\cdot \mid X_1, \dots, X_n)$ arising from disintegration (de Finetti decomposition), and PCRs are defined using, for example, the $W_p$ metric on $\mathcal{P}(\Theta)$, the space of probability measures on $\Theta$ (Camerlenghi et al., 2022). The formal PCR is then:

$$\varepsilon_n = \mathbb{E}_{\theta_0}\big[W_p\big(\pi_n(\cdot \mid X_1, \dots, X_n),\, \delta_{\theta_0}\big)\big],$$

which quantifies the expected $p$-Wasserstein distance of the random posterior to the Dirac mass at $\theta_0$.
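The distance from a posterior to a Dirac mass is easy to compute because the only coupling is the product measure: W_p(pi, delta_theta0)^p equals the integral of |theta - theta0|^p under pi. For p = 2 this is the posterior variance plus the squared bias of the posterior mean. The sketch below estimates the expected distance by Monte Carlo in a Bernoulli model with a uniform prior. That toy model is dominated and is an assumption chosen only because its Beta posterior is available in closed form; the function names are hypothetical.

```python
import math
import random


def w2_to_dirac_beta(a, b, theta0):
    """W_2 distance between Beta(a, b) and the Dirac mass at theta0."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return math.sqrt(var + (mean - theta0) ** 2)


def expected_w2(n, theta0=0.3, reps=500, seed=1):
    """Monte Carlo estimate of E[W_2(posterior, delta_theta0)].

    Toy model (an assumption): X_i ~ Bernoulli(theta0), uniform prior,
    so the posterior after k successes is Beta(1 + k, 1 + n - k).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        k = sum(rng.random() < theta0 for _ in range(n))
        total += w2_to_dirac_beta(1 + k, 1 + n - k, theta0)
    return total / reps


for n in (25, 100, 400):
    eps_n = expected_w2(n)
    # The rescaled value eps_n * sqrt(n) should stay roughly constant.
    print(n, eps_n, eps_n * math.sqrt(n))
```

In this toy setting the estimate roughly halves each time the sample size is multiplied by four, consistent with a parametric rate.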

3. Methodologies for Establishing Posterior Contraction Rates

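The general theoretical strategy for establishing PCRs involves verifying a combination of entropy (model complexity) conditions, prior-mass conditions near the true parameter, and suitable sieve constructions.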

Recent developments leverage the Wasserstein metric and the dynamic Benamou–Brenier formulation (Dolera et al., 2020, Dolera et al., 2022); the main ingredients are outlined in Section 5.

4. Quantitative Examples in Parametric, Nonparametric, and Infinite-dimensional Models

Regular parametric models

For regular finite-dimensional models, the posterior contracts at the optimal parametric rate, $\varepsilon_n = n^{-1/2}$, for any prior with positive, continuous density at $\theta_0$ (Dolera et al., 2022).
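The parametric rate can be verified by hand in a conjugate example. The derivation below is a sketch under an assumed toy model (Gaussian observations with unit variance and a standard normal prior), not the general argument of the cited work: it computes the expected squared distance of the posterior to the truth and then applies Chebyshev's inequality inside the posterior.

```latex
% Toy model (an assumption for illustration):
% X_1, ..., X_n i.i.d. N(\theta_0, 1), prior \theta ~ N(0, 1), \bar X_n the sample mean.
\begin{align*}
\Pi_n(\cdot \mid X^{(n)})
  &= N\!\left(\frac{n \bar X_n}{n+1},\ \frac{1}{n+1}\right), \\
\int (\theta - \theta_0)^2 \, \Pi_n(d\theta \mid X^{(n)})
  &= \frac{1}{n+1} + \left(\frac{n \bar X_n}{n+1} - \theta_0\right)^2, \\
\mathbb{E}_{\theta_0}\!\left[\int (\theta - \theta_0)^2 \, \Pi_n(d\theta \mid X^{(n)})\right]
  &= \frac{1}{n+1} + \frac{n}{(n+1)^2} + \frac{\theta_0^2}{(n+1)^2}
   = \frac{2n + 1 + \theta_0^2}{(n+1)^2}, \\
\mathbb{E}_{\theta_0}\!\left[\Pi_n\big(|\theta - \theta_0| \ge M_n n^{-1/2} \mid X^{(n)}\big)\right]
  &\le \frac{n}{M_n^2} \cdot \frac{2n + 1 + \theta_0^2}{(n+1)^2}
   \le \frac{2 + \theta_0^2}{M_n^2} \longrightarrow 0
   \quad \text{whenever } M_n \to \infty .
\end{align*}
```

Convergence of the expectation to zero implies convergence in probability, so $n^{-1/2}$ is a contraction rate in this toy model.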

Nonparametric Dirichlet-Laplace mixtures

For the location-mixture model $p_G(x) = \int f(x - z)\,G(dz)$ with Laplace kernel $f$ and a Dirichlet process prior on the mixing distribution $G$, Gao and van der Vaart (Gao et al., 2015) derive contraction rates both for the mixing distribution and for the density, the latter matching minimax lower bounds up to log factors.

Deep Gaussian process priors and compositional classes

Under deep GP priors for functions expressed as compositions $f = g_q \circ \cdots \circ g_0$ with layerwise Hölder or Besov regularity, the posterior contracts at a rate that achieves minimax adaptivity to the unknown compositional structure (Finocchio et al., 2021).

Besov-Laplace priors in white noise

For true functions in a Besov class and smoothness-matching Besov-Laplace priors, the strong posterior contraction rate in the Sobolev norm matches minimax lower bounds (Dolera et al., 2024).

High-dimensional and nonparametric sparsity

For regression with spike-and-slab or shrinkage priors, the rate is

$$\varepsilon_n = \sqrt{\frac{s_0 \log p}{n}},$$

where $s_0$ is the true sparsity and $p$ is the number of covariates (Zhang et al., 2019, Naveau et al., 2024).
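To get a feel for the magnitudes involved, the sketch below evaluates the standard sparse-regression rate sqrt(s0 * log(p) / n). Using this exact form is an assumption here, as are the function name `sparse_rate` and the example values of s0, p, and n.

```python
import math


def sparse_rate(s0, p, n):
    """Standard sparse rate sqrt(s0 * log(p) / n) (assumed form).

    s0: number of truly active covariates, p: total covariates, n: sample size.
    """
    return math.sqrt(s0 * math.log(p) / n)


for n in (200, 500, 2000):
    print(n, round(sparse_rate(s0=10, p=1000, n=n), 4))
```

The dimension enters only through log(p), so multiplying the number of covariates by ten costs far less than losing half the sample.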

Non-dominated nonparametric models

For Dirichlet process (DP) or normalized Gamma process priors in non-dominated settings (i.e., not absolutely continuous in the parameter), the contraction rate in the Wasserstein distance combines two terms (Camerlenghi et al., 2022): the first is the expected Wasserstein convergence rate of the empirical measure, and the second depends on the metric entropy and prior concentration.
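The empirical-measure term can be simulated directly in one dimension. The sketch below is an illustration under assumptions, not the setting of the cited paper: the data are uniform on [0, 1], and the W_1 distance between the empirical measure and the uniform law is computed exactly through the quantile representation, integrating |x_(i) - u| over each interval ((i-1)/n, i/n). The function names are hypothetical.

```python
import random


def w1_empirical_vs_uniform(sample):
    """Exact W_1 distance between the empirical measure and Uniform(0, 1).

    Uses W_1 = integral over u in (0, 1) of |F_n^{-1}(u) - u|, where the
    empirical quantile function equals the i-th order statistic on ((i-1)/n, i/n).
    """
    xs = sorted(sample)
    n = len(xs)

    def g(t):
        # Antiderivative of |t|.
        return 0.5 * t * abs(t)

    return sum(g(i / n - x) - g((i - 1) / n - x)
               for i, x in enumerate(xs, start=1))


def expected_w1(n, reps=300, seed=2):
    """Monte Carlo estimate of E[W_1(empirical measure, Uniform(0, 1))]."""
    rng = random.Random(seed)
    return sum(w1_empirical_vs_uniform([rng.random() for _ in range(n)])
               for _ in range(reps)) / reps


for n in (50, 200, 800):
    print(n, expected_w1(n))
```

In this one-dimensional toy case the estimate decays roughly like n^{-1/2}; in higher dimensions the empirical measure converges more slowly in Wasserstein distance.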

5. Analytical Structure: Wasserstein Dynamics, Glivenko–Cantelli, and Poincaré Constants

A central methodological innovation is the combination of:

  • Local Lipschitz-continuity of the posterior: If the posterior, viewed as a map from sufficient statistics or empirical measures into a space of probability measures equipped with the Wasserstein distance, is locally Lipschitz, then PCRs can be controlled by data fluctuation rates (Dolera et al., 2020, Dolera et al., 2022).
  • Dynamic Benamou–Brenier formulation of the Wasserstein distance: The infimum over transport plans is reframed as an infimum over absolutely continuous curves of probability measures $(\mu_t)_{t \in [0,1]}$ solving the continuity equation with minimal kinetic energy (Dolera et al., 2022).
  • Laplace method and weighted Poincaré–Wirtinger constants: Asymptotics in both finite and infinite dimension rely on Laplace expansions and spectral gap estimates for
