Bahadur-Rao Large Deviation Formula

Updated 20 January 2026

The Bahadur–Rao formula is an explicit asymptotic expansion that quantifies rare event probabilities by combining a dominant exponential rate with a detailed polynomial prefactor.
It refines classical large deviation principles by incorporating exact error terms and constants derived through saddle-point and Laplace methods.
Extensions cover i.i.d. and dependent variables, lattice cases, quantiles, and high-dimensional functionals, broadening its impact in probability theory.

A Bahadur–Rao type large deviation formula gives an explicit, sharp asymptotic expansion for the probability of a rare event, specifically a deviation of a sum or other functional of random variables far into its tail. The canonical form combines a dominant exponential term governed by a rate function (the Legendre–Fenchel transform of the cumulant generating function) with a precisely characterized polynomial prefactor. Standard large deviation results, by contrast, yield only logarithmic asymptotics. The Bahadur–Rao expansion pins down not just the rate but also the exact scaling and constant of the leading term, often with an explicit error bound, and has become the reference point for the precise analysis of large deviations in both classical and high-dimensional probability.

1. Definition and Classical Formulation

Let $X_1,\dots,X_n$ be i.i.d. real-valued random variables with partial sum $S_n = X_1 + \dots + X_n$ and cumulant generating function $\varphi(\theta) = \ln \mathbb{E}[e^{\theta X_1}]$ finite in a neighborhood of zero. The Cramér rate function is $I(x) = \sup_{\theta} \{\theta x - \varphi(\theta)\}$, and $\theta_x$ is the unique solution of $\varphi'(\theta_x) = x$. For $x$ above the mean (i.e., $x > \mathbb{E} X_1$, so that $\theta_x > 0$), the classical nonlattice Bahadur–Rao expansion states:

$\Pr\left\{S_n \geq n x\right\} = \frac{\exp(-n I(x))}{\theta_x \sqrt{2\pi n V(x)}}\left[1 + O(n^{-1})\right], \quad V(x) = \varphi''(\theta_x)$

valid uniformly for $x$ in compact intervals above the mean. In the lattice case, the factor $1/\theta_x$ in the prefactor is replaced by $d/(1 - e^{-d\theta_x})$, with $d$ the lattice span (Gyorfi et al., 2012).

This characterization sharply contrasts with the large deviation principle, which asserts

$\lim_{n \to \infty} \frac{1}{n} \log \Pr\left\{S_n \geq n x\right\} = -I(x)$

but leaves prefactors and finer corrections unspecified.
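As a numerical illustration of the gap between the two statements, the following sketch compares the exact tail, the logarithmic LDP estimate, and the Bahadur–Rao approximation for sums of i.i.d. Exp(1) variables. The choice of distribution, the function names, and the parameter values are assumptions made here for illustration. For Exp(1) one has $\varphi(\theta) = -\ln(1-\theta)$, $\theta_x = 1 - 1/x$, $I(x) = x - 1 - \ln x$, and $V(x) = x^2$.

```python
import math

def exact_tail(n, x):
    """P(S_n >= n*x) for S_n a sum of n i.i.d. Exp(1) variables.

    S_n is Gamma(n, 1), so the tail equals exp(-t) * sum_{k<n} t^k / k!
    with t = n*x. Terms are combined in log space to avoid overflow.
    """
    t = n * x
    logs = [k * math.log(t) - math.lgamma(k + 1) - t for k in range(n)]
    m = max(logs)
    return math.exp(m) * sum(math.exp(v - m) for v in logs)

def rate(x):
    """Cramer rate function I(x) = x - 1 - ln x for Exp(1), x > 1."""
    return x - 1.0 - math.log(x)

def bahadur_rao(n, x):
    """Leading-order Bahadur-Rao approximation for Exp(1) summands (assumed example)."""
    theta = 1.0 - 1.0 / x   # solves phi'(theta) = 1 / (1 - theta) = x
    var = x * x             # V(x) = phi''(theta_x) = 1 / (1 - theta_x)^2
    return math.exp(-n * rate(x)) / (theta * math.sqrt(2.0 * math.pi * n * var))

if __name__ == "__main__":
    x = 2.0
    for n in (25, 100, 400):
        p = exact_tail(n, x)
        # Columns: n, -(1/n) log P (LDP scale), I(x), exact / Bahadur-Rao
        print(n, -math.log(p) / n, rate(x), p / bahadur_rao(n, x))
```

In this sketch the ratio in the last column approaches 1 as $n$ grows, while $-\frac{1}{n}\log \Pr\{S_n \geq nx\}$ still differs visibly from $I(x)$ at the same $n$, because the logarithmic limit absorbs the $\sqrt{n}$ prefactor only at rate $(\log n)/n$.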

2. Generalizations to Sums of Bounded and Weakly Dependent Variables

Sharp Bahadur–Rao type expansions have been extended to sums of independent bounded random variables under $(2+\delta)$-moment conditions. For independent, mean-zero random variables $\xi_{1},\dots,\xi_n$ with sum $S_n = \sum_{i=1}^n \xi_i$, $\xi_i\leq 1$, and $\mathbb{E}\lvert \xi_i \rvert^{2+\delta} \leq B^{2+\delta}$ for some $B>0$ and $\delta \in (0,1]$, the expansion reads

$P\{ S_n \geq x \sigma \} = \left\{ \Theta(x) + O(B/\sigma + \sigma^{-\delta}) \right\} e^{-n \Lambda_n^*(x \sigma / n)}$

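where $\sigma^2 = \sum_{i=1}^n \mathbb{E}\xi_i^2$, $\Lambda_n^*$ is the Legendre transform of the normalized cumulant generating function $\Lambda_n(\lambda) = n^{-1} \ln \mathbb{E}[e^{\lambda S_n}]$, and $\Theta(x) = (1 - \Phi(x)) e^{x^2/2}$ with $\Phi$ the standard normal cdf. Because $\Theta(x) \sim 1/(x \sqrt{2\pi})$ as $x \to \infty$, this universal prefactor replaces the more distribution-dependent factor in the classical formula (Fan et al., 2012).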
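To see how quickly $\Theta(x)$ settles onto its large-$x$ form, the short sketch below evaluates both quantities with the standard library. The function names and the sample values of $x$ are illustrative assumptions, and the direct product of `erfc` and `exp` is only safe for moderate $x$ (roughly $x < 35$) before floating-point underflow and overflow interfere.

```python
import math

def theta_prefactor(x):
    """Theta(x) = (1 - Phi(x)) * exp(x^2 / 2), with Phi the standard normal cdf."""
    # 1 - Phi(x) = 0.5 * erfc(x / sqrt(2)); erfc stays accurate in the upper tail.
    return 0.5 * math.erfc(x / math.sqrt(2.0)) * math.exp(0.5 * x * x)

def mills_limit(x):
    """Large-x approximation 1 / (x * sqrt(2*pi))."""
    return 1.0 / (x * math.sqrt(2.0 * math.pi))

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 5.0, 10.0):
        t = theta_prefactor(x)
        # Columns: x, Theta(x), 1/(x sqrt(2 pi)), and their ratio
        print(x, t, mills_limit(x), t / mills_limit(x))
```

The ratio lies between $x^2/(1+x^2)$ and $1$ by the classical Mills-ratio bounds, so the approximation $1/(x\sqrt{2\pi})$ is accurate for large $x$ but overstates $\Theta(x)$ noticeably for $x$ near $1$.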

For variables satisfying Bernstein's condition, analogous results with uniform error bounds over extended $x$ ranges recover the Cramér-Bahadur–Rao expansion and interpolate between moderate and extreme deviation asymptotics (Fan et al., 2012).

3. Extensions to Lattice Variables, Quantiles, and Structured Functionals

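The lattice case yields an explicit correction in the prefactor. For Binomial variables (lattice span $d = 1$), with $u = k/n > p$:

$\Pr\{\mathrm{Bin}(n, p) \geq k\} = \frac{\exp(-n I(u))}{\sqrt{2\pi n u(1-u)}} \frac{u(1-p)}{u-p}\left[1 + O(n^{-1/2})\right]$

Here $I(u) = u \ln\frac{u}{p} + (1-u)\ln\frac{1-u}{1-p}$, $V(u) = u(1-u)$, and the factor $u(1-p)/(u-p)$ equals $1/(1 - e^{-\theta_u})$ with $\theta_u = \ln\frac{u(1-p)}{p(1-u)}$, so the formula is the general lattice expansion specialized to Bernoulli summands (Gyorfi et al., 2012).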
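The Binomial formula can be checked directly against the exact tail. The following is a minimal sketch, assuming $0 < p < k/n < 1$; the function names and the test parameters are illustrative choices, not part of the cited result.

```python
import math

def binom_tail(n, p, k):
    """Exact P(Bin(n, p) >= k), summed in log space for numerical stability."""
    def log_pmf(j):
        return (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                + j * math.log(p) + (n - j) * math.log(1.0 - p))
    logs = [log_pmf(j) for j in range(k, n + 1)]
    m = max(logs)
    return math.exp(m) * sum(math.exp(v - m) for v in logs)

def binom_rate(u, p):
    """Rate function I(u) for Bernoulli(p) summands (relative entropy)."""
    return u * math.log(u / p) + (1.0 - u) * math.log((1.0 - u) / (1.0 - p))

def lattice_bahadur_rao(n, p, k):
    """Leading-order lattice approximation for P(Bin(n, p) >= k), u = k/n > p."""
    u = k / n
    gauss = math.exp(-n * binom_rate(u, p)) / math.sqrt(2.0 * math.pi * n * u * (1.0 - u))
    # Lattice factor 1 / (1 - exp(-theta_u)) written in closed form.
    return gauss * u * (1.0 - p) / (u - p)

if __name__ == "__main__":
    p = 0.3
    for n in (100, 400, 1600):
        k = n // 2
        # Columns: n, exact tail, approximation, ratio
        exact = binom_tail(n, p, k)
        approx = lattice_bahadur_rao(n, p, k)
        print(n, exact, approx, exact / approx)
```

Using the nonlattice factor $1/\theta_u$ in place of $u(1-p)/(u-p)$ would leave a constant multiplicative bias that does not vanish as $n$ grows, which is why the lattice correction matters in practice.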

For sample quantiles, the expansion evaluates the probability that the empirical $p$ -quantile exceeds the population quantile by $t$ : $\mathbb{P}\left(x_{n,p} - x_p \geq t\right) = \frac{1}{\tau_t^+\, \sigma_p\, \sqrt{2\pi n}} \exp\left\{ -n \Lambda^+(t) \right\} [1 + o(1)]$ where $\tau_t^+$ is the saddle-point parameter, $\Lambda^+(t)$ the rate function depending on the difference between empirical and population quantiles, and $\sigma_p = \sqrt{p(1-p)}$ (Fan, 2023).

High-dimensional and geometric functionals, such as the ratio of geometric mean to arithmetic mean in random $\ell_p$ -balls, also satisfy two-dimensional or multidimensional analogues, with carefully constructed prefactors involving curvature and Jacobian corrections from saddle-point methods (Kaufmann et al., 2021).

4. Extensions to Structured Models: Products of Random Matrices, Triangular Arrays, and Threshold Models

For the log-norm of products of random matrices $(A_n \cdots A_1)$ , Bahadur–Rao expansions generalize using transfer operators and non-commutative moment generating functions. Under irreducibility and moment conditions,

$\mathbb{P}_x\left\{ S_n^x \geq n q \right\} \sim \frac{C_s(x)}{\sigma \sqrt{2\pi n}} e^{-n \Lambda^*(q)}$

with $C_s(x)=r_s(x)/s$, $S_n^x = \log \|A_n \cdots A_1 x\|$, and $\Lambda^*$ the Legendre transform of $\log k(s)$, where $k(s)$ is the dominant eigenvalue of a transfer operator and $s$ is the tilt parameter associated with $q$ (Buraczewski et al., 2014). Analogous results hold for the coefficients of random walks on $GL(d, \mathbb{R})$, again obtained under a tilted measure (Xiao et al., 2020).

Dependent array models arising in credit risk, threshold models, and portfolios further refine the prefactor: for Gaussian or exponential tails the correction is polylogarithmic; for heavy tails it is polynomial; and for bounded-support cases the prefactor is $n^{-3/2}$, owing to boundary contributions identified by Laplace–Olver analysis (Deng et al., 2025). The regime dictates the scaling and reflects the geometry and latent dependence structure.

5. Analytical Techniques Underpinning Bahadur–Rao Expansions

The proofs rely on an exponential change of measure (Esscher tilt or Cramér transform), which turns the rare event under the original measure into a typical, central-limit-scale event under the tilted law. The probability is then written as a Laplace (or Fourier) integral whose dominant contribution is isolated by the Laplace or saddle-point method. Gaussian or Edgeworth expansions under the tilted measure yield the polynomial and constant corrections in the prefactor. For lattice cases, renewal-theoretic or geometric-series arguments produce further corrections.
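For the classical i.i.d. nonlattice case, the tilting step can be sketched as follows. This is a heuristic outline in the notation of Section 1, not a full proof, and the symbol $\mathbb{Q}$ for the tilted law is introduced here only for illustration. Define $\mathbb{Q}$ by

$\frac{d\mathbb{Q}}{d\mathbb{P}} = \exp\left(\theta_x S_n - n \varphi(\theta_x)\right)$

Under $\mathbb{Q}$ the summands remain i.i.d., with mean $\varphi'(\theta_x) = x$ and variance $\varphi''(\theta_x) = V(x)$. Since $I(x) = \theta_x x - \varphi(\theta_x)$, undoing the tilt gives the exact identity

$\Pr\left\{S_n \geq n x\right\} = e^{-n I(x)}\, \mathbb{E}_{\mathbb{Q}}\left[ e^{-\theta_x (S_n - n x)}\, \mathbf{1}\{S_n \geq n x\} \right]$

Approximating $(S_n - nx)/\sqrt{n V(x)}$ under $\mathbb{Q}$ by a standard normal variable and applying the Laplace method to the resulting integral yields

$\mathbb{E}_{\mathbb{Q}}\left[ e^{-\theta_x (S_n - n x)}\, \mathbf{1}\{S_n \geq n x\} \right] \approx \int_0^\infty e^{-\theta_x \sqrt{n V(x)}\, y}\, \frac{e^{-y^2/2}}{\sqrt{2\pi}}\, dy \approx \frac{1}{\theta_x \sqrt{2\pi n V(x)}}$

which is the classical prefactor. Making the normal approximation rigorous at this exponential scale is where the nonlattice assumption and Edgeworth-type expansions enter.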

In multivariate and infinitely divisible settings, the expansion entails higher-dimensional saddle-point analysis, where the fluctuation determinant and boundary curvature contribute nontrivial corrections (e.g., in the p-AGM and random matrix settings).

6. Applications, Regimes, and Connections to Other Large Deviation Results

Bahadur–Rao expansions provide numerically sharp estimates for rare-event probabilities, enabling precise confidence bounds for quantiles, high-dimensional functionals, and threshold exceedances in risk models, as well as exact tail asymptotics in random matrix theory and for combinatorial coefficients such as d'Arcais numbers. They connect the moderate and large deviation regimes, showing that CLT-based approximations pass smoothly into exponential tail estimates, with explicit error control (Gyorfi et al., 2012; Fan, 2023).

Comparisons indicate that for bounded or lattice random variables, the universality of the prefactor contrasts with the distribution-dependent correction in classical settings. For dependent and array models, the scaling of the prefactor reveals regimes not detected by the standard LDP, such as the $n^{-3/2}$ prefactor in boundary cases and critical exponents in heavy-tail scenarios (Deng et al., 2025).

7. Summary Table: Canonical Bahadur–Rao Expansions

Setting	Rate function	Prefactor
Classical i.i.d., nonlattice	$I(x)$	$[2\pi n V(x)]^{-1/2} \theta_x^{-1}$
Lattice case	$I(x)$	$[2\pi n V(x)]^{-1/2}\, d/(1 - e^{-d\theta_x})$
Bounded summands	$\Lambda_n^*(x\sigma/n)$	Universal: $\Theta(x) \sim 1/(x\sqrt{2\pi})$ as $x \to \infty$
Quantiles	$\Lambda^+(t)$	$(\tau_t^+ \sigma_p \sqrt{2\pi n})^{-1}$
Random matrix norm	$\Lambda^*(q)$	$C_s(x)/[\sigma\sqrt{2\pi n}]$
Portfolio/threshold models	See text	$n^{-1/2}$, $n^{-3/2}$, or log/power corrections

Each formula sharpens the respective LDP by providing the constant and scaling in the non-exponential prefactor, which is essential for high-precision probabilistic estimates.

References: