
Log-Probability Variance Overview

Updated 1 April 2026
  • Log-probability variance is a metric that quantifies the dispersion of the logarithm of positive random variables and probability densities.
  • It is used to assess multiplicative variability, enabling robust statistical estimation through geometric means and Bayesian bootstrap approaches.
  • In variational inference, reducing log-probability variance leads to tighter bounds and more stable optimization of evidence lower bounds.

Log-probability variance quantifies the spread or dispersion of the logarithm of random variables, probability densities, or likelihood ratios. This concept arises naturally across probabilistic inference, estimation theory, statistical uncertainty analysis, and information theory, especially when data and models are defined on the positive real line and multiplicative effects or proportional errors dominate. Its operational definitions, estimation properties, and interpretative roles are fundamental in modern statistics, variational inference, and entropy calculations.

1. Formal Definitions

For a positive random variable $X > 0$, the log-probability variance (or log-variance) is defined as the variance of $\log X$: $\operatorname{Var}_{\log}(X) := \operatorname{Var}(\log X) = \mathbb{E}\left[(\log X - \mathbb{E}[\log X])^2\right]$. The choice of logarithm base (natural $\ln$ or base-10 $\log_{10}$) is context-dependent; in information theory and many statistical applications, the natural logarithm predominates.
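As a minimal sketch of this definition, the Python below computes the sample variance of $\ln X$ and checks that rescaling the data by a positive constant leaves it unchanged, which is what makes the log-variance a measure of multiplicative spread. The helper name `log_variance` and the data values are illustrative assumptions, not taken from the cited works.

```python
import math
import statistics

def log_variance(xs):
    """Sample variance (n - 1 denominator) of the natural log of positive data."""
    if any(x <= 0 for x in xs):
        raise ValueError("log-variance is only defined for positive values")
    return statistics.variance([math.log(x) for x in xs])

# Illustrative (made-up) data spanning several orders of magnitude.
data = [0.5, 2.0, 8.0, 40.0, 300.0]
base = log_variance(data)
# Multiplying every value by a constant shifts ln X but does not change its variance.
scaled = log_variance([1000.0 * x for x in data])
print(base, scaled)
```

The two printed values agree up to floating-point rounding, whereas the ordinary variance of the same data would grow by a factor of $10^6$ under the same rescaling.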

In the context of probability densities, for a random variable $X$ with density $f_X$, the variance of self-information is given by

$$\operatorname{Var}[I(X)] = \operatorname{Var}(-\ln f_X(X))$$

which, for many important distributional families, is a function of the log-variance plus a shape-dependent constant (0705.4045).

For log-likelihood ratios in variational inference, let $r(z) = \frac{p(x,z)}{q(z)}$; the log-probability variance is then $\operatorname{Var}_q(\log r(z))$ (Richter et al., 2020, Huang et al., 2019).

2. Metric Structure and Geometric Interpretation

Log-probability variance emerges naturally from the logarithmic metric on $(0, \infty)$: $d(x, y) = |\ln x - \ln y|$. For random positive vectors, a corresponding metric is

$$d(x, y) = \left(\sum_{i=1}^{k} (\ln x_i - \ln y_i)^2\right)^{1/2}$$

The minimizer of the expected squared logarithmic distance to a sample set is the geometric mean, and the corresponding minimal value is the log-variance (Gzyl, 2017). The logarithmic metric is fundamentally multiplicative, and the log-variance quantifies dispersion in multiplicative terms rather than additive ones.
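The minimization property can be checked numerically. The sketch below, using made-up data and hypothetical helper names, evaluates the mean squared logarithmic distance from a candidate center to the sample and compares its value at the geometric mean with the $1/n$ variance of $\ln x$.

```python
import math

def mean_sq_log_distance(center, xs):
    """Average of (ln x - ln center)^2 over the sample."""
    return sum((math.log(x) - math.log(center)) ** 2 for x in xs) / len(xs)

def geometric_mean(xs):
    """exp of the arithmetic mean of ln x."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

data = [0.5, 2.0, 8.0, 40.0, 300.0]  # illustrative values
g = geometric_mean(data)
at_geometric_mean = mean_sq_log_distance(g, data)
at_arithmetic_mean = mean_sq_log_distance(sum(data) / len(data), data)

# Population-style (1/n) variance of ln x, for comparison with the minimum.
log_vals = [math.log(x) for x in data]
log_mean = sum(log_vals) / len(log_vals)
log_var = sum((v - log_mean) ** 2 for v in log_vals) / len(log_vals)
print(g, at_geometric_mean, at_arithmetic_mean, log_var)
```

The distance at the geometric mean matches the log-variance, and any other center, including the arithmetic mean, gives a larger value.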

3. Statistical Estimation, Uncertainty, and Bootstrap Methods

For small-sample, high-log-variance data, conventional uncertainty estimates based on the arithmetic mean and its standard error are inadequate, often producing unphysical negative lower confidence bounds. The appropriate approach involves transforming data to log space, where the log-variance is estimated as $\hat{\sigma}_{\log}^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\ln x_i - \overline{\ln x}\right)^2$, with $\overline{\ln x} = \frac{1}{n}\sum_{i=1}^{n}\ln x_i$. Bayesian bootstrap methods in log space, which assign Dirichlet$(1, \ldots, 1)$-distributed weights to each datum, yield credible intervals that are more robust than standard bootstrap intervals, especially when the log-variance $\hat{\sigma}_{\log}^2$ is large and the sample size $n$ is modest. The Bayesian bootstrap avoids the extreme lower-limit bias of the standard bootstrap, which arises when resampled replicates omit rare but dominating large values (Mostofian et al., 2018).

It is recommended to always report the empirical log-variance $\hat{\sigma}_{\log}^2$, the geometric mean, and Dirichlet-based credible intervals for multiplicative or highly dispersed data. However, Mostofian et al. caution that neither the standard bootstrap nor the Bayesian bootstrap fully resolves the systematic underestimation of the mean when the sample size $n$ is very small and the log-variance is large.
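A Bayesian bootstrap in log space can be sketched with the standard library alone. The code below is an illustrative assumption about how such a procedure might look, not the implementation of Mostofian et al.: it draws Dirichlet$(1, \ldots, 1)$ weights by normalizing unit-rate exponential variates, forms the weighted mean of $\ln x_i$, and reports an equal-tailed credible interval for the geometric mean. The function names, the choice of the geometric mean as the target statistic, and the data are all hypothetical.

```python
import math
import random

def dirichlet_uniform_weights(n, rng):
    """Draw one weight vector from Dirichlet(1, ..., 1) by normalizing Exp(1) variates."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

def bayesian_bootstrap_geometric_mean(xs, n_draws=4000, level=0.95, seed=0):
    """Equal-tailed credible interval for the geometric mean of positive data."""
    rng = random.Random(seed)
    logs = [math.log(x) for x in xs]
    replicates = []
    for _ in range(n_draws):
        weights = dirichlet_uniform_weights(len(logs), rng)
        weighted_log_mean = sum(w * v for w, v in zip(weights, logs))
        replicates.append(math.exp(weighted_log_mean))
    replicates.sort()
    lower = replicates[int((1 - level) / 2 * (n_draws - 1))]
    upper = replicates[int((1 + level) / 2 * (n_draws - 1))]
    return lower, upper

# Made-up sample with a few dominating large values.
data = [0.02, 0.3, 1.1, 4.0, 9.5, 120.0, 0.8, 2500.0]
geo_mean = math.exp(sum(math.log(x) for x in data) / len(data))
lower, upper = bayesian_bootstrap_geometric_mean(data)
print(geo_mean, lower, upper)
```

Because every datum receives a strictly positive weight in each replicate, no replicate can drop a rare large value entirely, and the interval limits are positive by construction.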

4. Log-probability Variance in Variational Inference

In variational inference, the variance of the log-likelihood ratio under the variational approximation,

$$\operatorname{Var}_q(\log r(z)) = \mathbb{E}_q\left[(\log r(z) - \mathbb{E}_q[\log r(z)])^2\right], \qquad r(z) = \frac{p(x,z)}{q(z)},$$

controls the tightness of the variational bound on $\log p(x)$. The so-called variational gap $\log p(x) - \mathbb{E}_q[\log r(z)]$ is upper-bounded by a measure of the dispersion of the likelihood ratio.

Thus, variance-reduction techniques that concentrate the distribution of $r(z)$ or $\log r(z)$ (e.g., averaging, correlated sampling) directly tighten the bound on the variational bias (Huang et al., 2019).
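The link between the gap and the dispersion of $\log r(z)$ can be illustrated on a toy model. The sketch below assumes a conjugate Gaussian model chosen purely for illustration (prior $z \sim \mathcal{N}(0,1)$, likelihood $x \mid z \sim \mathcal{N}(z,1)$, so that $\log p(x)$ and the exact posterior are known). It does not implement the specific bound of Huang et al.; it only shows that the gap and $\operatorname{Var}_q(\log r(z))$ vanish together at the exact posterior and are both large for a poor $q$.

```python
import math
import random

def gaussian_log_density(value, mean, var):
    """Log density of N(mean, var) evaluated at value."""
    return -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)

def gap_and_log_ratio_variance(x, q_mean, q_var, n_samples=50000, seed=0):
    """Monte Carlo estimates of log p(x) - E_q[log r] and Var_q(log r)."""
    rng = random.Random(seed)
    log_r = []
    for _ in range(n_samples):
        z = rng.gauss(q_mean, math.sqrt(q_var))
        log_joint = gaussian_log_density(z, 0.0, 1.0) + gaussian_log_density(x, z, 1.0)
        log_r.append(log_joint - gaussian_log_density(z, q_mean, q_var))
    elbo = sum(log_r) / n_samples
    variance = sum((v - elbo) ** 2 for v in log_r) / (n_samples - 1)
    log_evidence = gaussian_log_density(x, 0.0, 2.0)  # marginal of x is N(0, 2)
    return log_evidence - elbo, variance

x_obs = 1.0
# Exact posterior for this toy model is N(x / 2, 1 / 2): log r is constant in z.
exact_gap, exact_var = gap_and_log_ratio_variance(x_obs, q_mean=x_obs / 2, q_var=0.5)
# A deliberately poor approximation.
poor_gap, poor_var = gap_and_log_ratio_variance(x_obs, q_mean=-1.0, q_var=2.0)
print(exact_gap, exact_var, poor_gap, poor_var)
```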

The log-variance loss

$$\mathcal{L}_{\mathrm{LV}}(\phi) = \frac{1}{2}\operatorname{Var}_{z \sim \tilde{q}}\left[\log q_\phi(z) - \log p(x,z)\right],$$

where $q_\phi$ is the variational approximation with parameters $\phi$ and $\tilde{q}$ is the distribution from which samples are drawn, yields, upon differentiation with respect to $\phi$ and evaluation at $\tilde{q} = q_\phi$, the gradient of the (negative) Evidence Lower Bound (ELBO), providing a variance-reduced estimator (VarGrad) that requires no explicit differentiation through Monte Carlo samples (Richter et al., 2020). This estimator achieves variance properties at least as good as, and in many settings better than, the REINFORCE score-function estimator, and is stable in high-dimensional models.
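A VarGrad-style estimate can be sketched for a one-parameter case. The example below assumes $q_\phi = \mathcal{N}(\phi, 1)$ and an unnormalized Gaussian target with mean `target_mean`, both chosen only so that the exact gradient $\phi - \text{target\_mean}$ of the KL divergence is available for comparison. It differentiates the sample variance of $f(z) = \log q_\phi(z) - \log p(x,z)$ with the samples held fixed, which reduces to a centered score-function average. This is an illustrative reading of the estimator, not the reference implementation of Richter et al.

```python
import random

def vargrad_estimate(phi, target_mean, n_samples=20000, seed=0):
    """Gradient of half the sample variance of f(z), with samples from q_phi held fixed."""
    rng = random.Random(seed)
    zs = [rng.gauss(phi, 1.0) for _ in range(n_samples)]
    # f(z) = log q_phi(z) - log p(x, z); additive constants cancel after centering.
    f = [-0.5 * (z - phi) ** 2 + 0.5 * (z - target_mean) ** 2 for z in zs]
    f_bar = sum(f) / n_samples
    # Score of a unit-variance Gaussian: d/dphi log q_phi(z) = z - phi.
    score = [z - phi for z in zs]
    return sum((fi - f_bar) * si for fi, si in zip(f, score)) / (n_samples - 1)

phi, target_mean = 1.5, 0.0
grad = vargrad_estimate(phi, target_mean)
exact_grad = phi - target_mean  # d/dphi KL(N(phi, 1) || N(target_mean, 1))
print(grad, exact_grad)
```

No derivative of the sampled $z$ with respect to $\phi$ appears anywhere, so the same pattern applies when reparameterization is unavailable.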

5. Entropy and Information-theoretic Roles

For families of positive random variables closed under transformations $X \mapsto aX^b$, the entropy $h(X)$ and the variance of self-information $\operatorname{Var}[I(X)]$ admit closed forms in terms of the mean and variance of $\ln X$. The entropy takes the form

$$h(X) = \mathbb{E}[\ln X] + \frac{1}{2}\ln \operatorname{Var}(\ln X) + c_1,$$

while $\operatorname{Var}[I(X)]$ is a function of $\operatorname{Var}(\ln X)$ plus a constant $c_2$, where $c_1$ and $c_2$ are constants specific to the distribution family but independent of the scaling or power parameters (0705.4045). This framework applies to the generalized gamma, log-normal, exponential, and gamma distributions.

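For example, for the log-normal distribution,

$$h(X) = \mu + \frac{1}{2}\ln\left(2\pi e \sigma^2\right), \qquad \operatorname{Var}[I(X)] = \sigma^2 + \frac{1}{2},$$

where $\mu$ and $\sigma^2$ are the mean and variance of $\ln X$.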
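The log-normal case can be checked by simulation. The sketch below compares Monte Carlo estimates of the mean and variance of $-\ln f_X(X)$ with the standard log-normal closed forms $\mu + \tfrac{1}{2}\ln(2\pi e \sigma^2)$ and $\sigma^2 + \tfrac{1}{2}$. The parameter values and helper name are arbitrary choices for illustration.

```python
import math
import random

def self_information_lognormal(x, mu, sigma):
    """-ln f_X(x) for a log-normal with log-mean mu and log-standard-deviation sigma."""
    z = (math.log(x) - mu) / sigma
    return math.log(x) + 0.5 * math.log(2 * math.pi * sigma ** 2) + 0.5 * z ** 2

mu, sigma = 0.3, 0.8  # arbitrary illustrative parameters
rng = random.Random(0)
n = 100000
info = [self_information_lognormal(rng.lognormvariate(mu, sigma), mu, sigma)
        for _ in range(n)]

entropy_mc = sum(info) / n
var_info_mc = sum((v - entropy_mc) ** 2 for v in info) / (n - 1)

entropy_closed = mu + 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)
var_info_closed = sigma ** 2 + 0.5
print(entropy_mc, entropy_closed, var_info_mc, var_info_closed)
```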

6. Asymptotic Laws and Limit Behavior

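When independent and identically distributed $X_1, \ldots, X_n$ are considered, the empirical geometric mean,

$$G_n = \left(\prod_{i=1}^{n} X_i\right)^{1/n} = \exp\left(\frac{1}{n}\sum_{i=1}^{n}\ln X_i\right)$$

converges almost surely to the population geometric mean $\exp(\mathbb{E}[\ln X])$. The central limit theorem in logarithmic distance asserts that

$$\sqrt{n}\left(\ln G_n - \mathbb{E}[\ln X]\right) \xrightarrow{d} \mathcal{N}\left(0, \operatorname{Var}(\ln X)\right)$$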

and the scaled geometric mean converges in distribution to a log-normal, just as the arithmetic mean under Euclidean error converges to a Gaussian (Gzyl, 2017).

Log-variance remains the natural measure of spread in this context, replacing the classical variance for multiplicative data. Empirical estimators for $\mathbb{E}[\ln X]$ and $\operatorname{Var}(\ln X)$ are unbiased and consistent in the logarithmic metric.
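The limit behavior of the geometric mean can be illustrated by simulation. The sketch below assumes Gamma(2, 1) data, chosen only because $\mathbb{E}[\ln X] = 1 - \gamma$ (with $\gamma$ the Euler-Mascheroni constant) and $\operatorname{Var}(\ln X) = \pi^2/6 - 1$ are known in closed form for that distribution. It checks that $\sqrt{n}(\ln G_n - \mathbb{E}[\ln X])$ is centered near zero with spread close to the log-variance.

```python
import math
import random

def scaled_log_geometric_means(n, reps, seed=0):
    """Return sqrt(n) * (ln G_n - E[ln X]) for `reps` samples of size n from Gamma(2, 1)."""
    rng = random.Random(seed)
    euler_gamma = 0.5772156649015329
    log_mean = 1.0 - euler_gamma  # E[ln X] = digamma(2) for shape 2, scale 1
    values = []
    for _ in range(reps):
        mean_log = sum(math.log(rng.gammavariate(2.0, 1.0)) for _ in range(n)) / n
        values.append(math.sqrt(n) * (mean_log - log_mean))
    return values

vals = scaled_log_geometric_means(n=200, reps=2000)
center = sum(vals) / len(vals)
spread = sum((v - center) ** 2 for v in vals) / (len(vals) - 1)
log_var = math.pi ** 2 / 6 - 1.0  # Var(ln X) = trigamma(2)
print(center, spread, log_var)
```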

7. Practical Computation and Empirical Use

Practical procedures for handling high log-variance data or log-probability estimates are summarized as follows:

  • Always log-transform positive, multiplicatively dispersed data.
  • Estimate sample log-mean and log-variance.
  • Employ Bayesian bootstrap with Dirichlet weights in log space for uncertainty quantification (Mostofian et al., 2018).
  • Use log-variance-based loss functions in variational inference for low-variance gradient estimates, avoiding high-variance score-function methods (Richter et al., 2020).
  • In entropy/information-theoretic settings, compute sample means and variances of $\ln X$ for parameter-free, closed-form estimates of entropy and self-information variance (0705.4045).

Taken together, log-probability variance is a central analytic quantity for describing, estimating, and optimizing in settings dominated by multiplicative randomness, non-Gaussian tails, or information-theoretic criteria. Its estimation and interpretation are critical wherever the arithmetic mean and variance fail to provide robust, physically meaningful, or unbiased results.
