Unilaterally Truncated Gaussian Distributions

Updated 17 November 2025
  • Unilaterally truncated Gaussian distributions (UTGDs) are defined by conditioning a Gaussian variable to exceed a fixed threshold, yielding closed-form moments and specialized sampling methods.
  • They are widely applied in state estimation, control systems, and machine learning to handle non-negativity constraints and censored data.
  • Specialized estimation and simulation techniques enable precise parameter recovery and variance calibration, with rapid convergence in practice.

A unilaterally truncated Gaussian distribution (UTGD) arises when a Gaussian random variable is restricted to a semi-infinite interval, typically values above a fixed threshold. UTGDs are ubiquitous in applications that impose hard physical or logical constraints, including state estimation with non-negativity requirements, truncated noise processes in control, structured graphical models with rectifying nonlinearities, and the analysis of incomplete or censored data. The theoretical study of UTGDs centers on their closed-form moments, characterizations of maximal variance, efficient parameter estimation, sampling algorithms, and implications for statistical learning, especially in situations requiring concentration inequalities or sub-Gaussian analysis.

1. Definition and Core Properties

Let $X \sim \mathcal{N}(\mu, \sigma^2)$ and fix a threshold $a \in \mathbb{R}$. The unilaterally truncated (lower-truncated) Gaussian distribution is the law of $X$ conditional on $X \ge a$. Its density is given by

$$f(x;\mu,\sigma,a) = \begin{cases} \dfrac{1}{\sigma}\,\dfrac{\phi\left(\frac{x-\mu}{\sigma}\right)}{1 - \Phi\left(\frac{a-\mu}{\sigma}\right)}, & x \ge a \\[1.5ex] 0, & x < a \end{cases}$$

where $\phi(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}$ is the standard normal density and $\Phi(t)$ its cumulative distribution function. The normalizing constant is $Z = 1 - \Phi(\alpha)$, with the standardized truncation point $\alpha = (a-\mu)/\sigma$.

Key closed-form results (a brief numerical check follows the list):

  • Mean:

$$\mathbb{E}[X \mid X \ge a] = \mu + \sigma\,\frac{\phi(\alpha)}{1 - \Phi(\alpha)}$$

  • Variance:

$$\mathrm{Var}[X \mid X \ge a] = \sigma^2 \left[\, 1 + \alpha\,\frac{\phi(\alpha)}{1-\Phi(\alpha)} - \left(\frac{\phi(\alpha)}{1-\Phi(\alpha)}\right)^{2} \right]$$

  • Moment Generating Function (MGF):

$$M_{X_{[a,\infty)}}(\theta) = \exp\!\left(\theta\mu + \tfrac{1}{2}\theta^2\sigma^2\right)\,\frac{1 - \Phi(\alpha - \sigma\theta)}{1 - \Phi(\alpha)}$$
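
As a quick check of the closed-form moments above, the following minimal sketch (illustrative parameter values, not taken from any cited paper) compares them with SciPy's `truncnorm` reference implementation:

```python
# Verify the closed-form truncated mean/variance against scipy.stats.truncnorm.
import numpy as np
from scipy.stats import norm, truncnorm

mu, sigma, a = 1.0, 2.0, 0.5                 # illustrative parameters
alpha = (a - mu) / sigma                     # standardized truncation point
lam = norm.pdf(alpha) / norm.sf(alpha)       # inverse Mills ratio phi(alpha)/(1 - Phi(alpha))

mean_cf = mu + sigma * lam                          # closed-form mean
var_cf = sigma**2 * (1 + alpha * lam - lam**2)      # closed-form variance

# SciPy's truncnorm is parameterized by the standardized bounds [alpha, +inf).
ref = truncnorm(alpha, np.inf, loc=mu, scale=sigma)
print(mean_cf, ref.mean())   # the two means should agree to machine precision
print(var_cf, ref.var())     # likewise for the variances
```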

2. Maximal Variance, Bounds, and Calibration

A key analytical result, established in "The Maximal Variance of Unilaterally Truncated Gaussian and Chi Distributions" (Petrella, 14 Nov 2025), is
$$\sup_{\mu,\sigma} \mathrm{Var}[X \mid X \ge a] = (M-a)^2,$$
where $M$ is the fixed mean of the truncated distribution and $a$ is the threshold. The supremum is achieved as the location parameter $\mu \to -\infty$ (at fixed $M$ and $a$), reflecting the fact that pushing the Gaussian far into the left tail, while calibrating $\sigma$ such that the mean remains fixed, maximizes the variance. For a fixed cutoff $a$, the variance can always be expressed in terms of $M$, $a$, and $\mu$.
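
The limiting behaviour can be illustrated numerically. The sketch below (an illustration under assumed parameter values, not code from the cited paper) holds $a$ and the truncated mean $M$ fixed, re-calibrates $\sigma$ by root-finding as $\mu$ is pushed left, and shows the variance approaching $(M-a)^2$; the inverse Mills ratio is evaluated via `scipy.special.erfcx` for numerical stability:

```python
# Numerical illustration of the maximal-variance result: hold the threshold a
# and the truncated mean M fixed, push mu to the left while re-calibrating
# sigma so that E[X | X >= a] = M, and watch Var[X | X >= a] climb toward
# (M - a)^2. Parameter values are illustrative.
import numpy as np
from scipy.optimize import brentq
from scipy.special import erfcx

def inv_mills(alpha):
    # phi(alpha) / (1 - Phi(alpha)), evaluated stably via the scaled erfc
    return np.sqrt(2.0 / np.pi) / erfcx(alpha / np.sqrt(2.0))

def trunc_mean(mu, sigma, a):
    alpha = (a - mu) / sigma
    return mu + sigma * inv_mills(alpha)

def trunc_var(mu, sigma, a):
    alpha = (a - mu) / sigma
    lam = inv_mills(alpha)
    return sigma**2 * (1.0 + alpha * lam - lam**2)

a, M = 0.0, 1.0
for mu in (-1.0, -5.0, -50.0, -500.0):
    # calibrate sigma so the truncated mean stays at M
    sigma = brentq(lambda s: trunc_mean(mu, s, a) - M, 1e-6, 1e3)
    print(mu, trunc_var(mu, sigma, a), (M - a) ** 2)   # variance climbs toward 1.0
```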

Approximations are used for parameter inference and calibration:

  • For the scaled shift $U \equiv (\mu-a)/(M-a)$ over $U \in [-10, 0.9]$,

$$\frac{\sigma}{M-a} \approx \sqrt{2 - \exp\!\left[-\alpha(1-U)^{\beta} - U\right]}$$

and

$$\frac{\mathrm{Var}}{(M-a)^2} \approx 1 - \exp\!\left[-\alpha(1-U)^{\beta}\right],$$

with $(\alpha, \beta)$ given as explicit polynomials in $U$ (see Table VII of Petrella, 14 Nov 2025). Here $\alpha$ and $\beta$ denote fitted coefficients, not the standardized truncation point of Section 1.

Moment-intersecting and point-slope methods allow highly precise parameter recovery, typically outperforming naive least-squares fits; they converge rapidly, with relative errors below $10^{-7}$ observed. Calibration workflows can thus accurately match empirical mean-variance pairs to UTGD parameters, as in the sketch below.
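
A minimal sketch of one such calibration step, assuming the truncation point $a$ is known and using only the closed-form moment expressions from Section 1 (the target moments below are illustrative, and the simple root-finding here stands in for the paper's moment-intersecting and point-slope schemes):

```python
# Moment-based calibration sketch: given a known threshold a and a target
# (e.g. empirical) truncated mean/variance pair, solve the two closed-form
# moment equations for (mu, sigma). Target values are illustrative.
import numpy as np
from scipy.optimize import fsolve
from scipy.special import erfcx

def inv_mills(alpha):
    # phi(alpha) / (1 - Phi(alpha)) via the scaled complementary error function
    return np.sqrt(2.0 / np.pi) / erfcx(alpha / np.sqrt(2.0))

def trunc_moments(mu, sigma, a):
    alpha = (a - mu) / sigma
    lam = inv_mills(alpha)
    mean = mu + sigma * lam
    var = sigma**2 * (1.0 + alpha * lam - lam**2)
    return mean, var

a = 0.0                                # known truncation point
target_mean, target_var = 1.2, 0.55    # must satisfy target_var < (target_mean - a)**2

def residual(p):
    mu, log_sigma = p                  # optimize log(sigma) to keep sigma positive
    m, v = trunc_moments(mu, np.exp(log_sigma), a)
    return [m - target_mean, v - target_var]

mu_hat, log_sigma_hat = fsolve(residual, x0=[target_mean, 0.0])
print(mu_hat, np.exp(log_sigma_hat))
print(trunc_moments(mu_hat, np.exp(log_sigma_hat), a))   # reproduces the targets
```

In practice the target moments would come from data, and the feasibility condition implied by the maximal-variance bound above (target variance below $(M-a)^2$) should be checked before solving.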

3. Efficient Sampling Algorithms

Efficient simulation from UTGDs is critical for probabilistic modeling, inference, and Monte Carlo methods. A suite of optimized algorithms is developed in "Fast simulation of truncated Gaussian distributions" (Chopin, 2012):

  • For univariate UTGDs, a table-based accept-reject method achieves acceptance probability above $0.99$ for $a \in [-2, 6]$ and remains highly efficient even for large $a$. The approach partitions $[a, \infty)$ into rectangles, leveraging precomputed tables for function evaluations, thereby minimizing floating-point overhead (a simplified rejection sketch follows this list).
  • Expected operations per sample: $O(1)$, with most draws requiring only a uniform variate, two multiplications, and a comparison.
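
For orientation, the sketch below implements a much simpler classical accept-reject sampler for one-sided truncation (a shifted-exponential tail proposal), not the table-based algorithm of Chopin (2012); it is meant only to make the accept-reject idea concrete:

```python
# Simplified accept-reject sampler for X ~ N(mu, sigma^2) conditioned on X >= a.
# This is NOT the table-based method of Chopin (2012); it is a classical
# shifted-exponential rejection scheme for the Gaussian tail, shown only to
# illustrate the accept-reject idea for one-sided truncation.
import numpy as np

rng = np.random.default_rng(0)

def sample_lower_truncated(mu, sigma, a, size=1):
    alpha = (a - mu) / sigma                  # standardized truncation point
    out = np.empty(size)
    for i in range(size):
        while True:
            if alpha <= 0.0:
                # naive rejection from N(0, 1): acceptance probability >= 0.5
                z = rng.standard_normal()
                if z >= alpha:
                    break
            else:
                # shifted-exponential proposal; efficient even far in the tail
                lam = 0.5 * (alpha + np.sqrt(alpha**2 + 4.0))
                z = alpha + rng.exponential(1.0 / lam)
                if rng.random() <= np.exp(-0.5 * (z - lam) ** 2):
                    break
        out[i] = mu + sigma * z
    return out

xs = sample_lower_truncated(mu=0.0, sigma=1.0, a=3.0, size=10_000)
print(xs.min(), xs.mean())   # every sample >= 3; mean matches the closed form (~3.28)
```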

For bivariate and higher dimensions, stratified accept-reject proposals (S$^+$, S$^-$, M$^+$, M$^-$) or block Gibbs strategies are used. For example, in bivariate semi-infinite truncation, the acceptance rate always exceeds $0.5$ for all admissible parameterizations.

For conditional UTGD sampling in graphical models or Gibbs samplers (e.g., in RTGGM architectures), the standard inverse CDF method or these accept-reject samplers are typically used. All proposals remain efficient when parameters or truncation change dynamically.

4. Parameter Estimation and Statistical Inference

UTGDs present unique challenges in parameter estimation, especially under unknown truncation points or multi-parameter settings. "Efficient Truncated Statistics with Unknown Truncation" (Kontonis et al., 2019) addresses the problem of inferring $(\mu, \sigma, a)$ from i.i.d. UTGD samples:

  • The maximum likelihood (ML) landscape is non-concave in general, motivating a two-stage procedure:
    1. Set recovery (support estimation) via Hermite polynomial expansions.
    2. Parameter extraction by recasting the problem as a convex optimization (in re-parameterizations such as $u = \mu/\sigma^2$, $B = 1/\sigma^2$).

Alternatively, moment-based estimators solve the system defined by the empirical mean and variance of the samples against the UTGD closed-form moment expressions. For $N = O\bigl(\tfrac{1}{\alpha\epsilon^2}\log\tfrac{1}{\delta}\bigr)$ samples, one can reconstruct the parameters to any accuracy $\epsilon > 0$, leveraging strong convexity in 1D and analytic gradients.
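
As a hedged, simplified illustration of likelihood-based fitting (not the two-stage procedure of Kontonis et al., which handles unknown truncation), the sketch below estimates $(\mu, \sigma)$ by direct minimization of the negative log-likelihood when $a$ is known, on synthetic data:

```python
# Maximum-likelihood sketch for a UTGD with KNOWN truncation point a,
# using the density from Section 1 on synthetic data. (The unknown-truncation
# setting of Kontonis et al. requires the more involved two-stage procedure.)
import numpy as np
from scipy.stats import norm, truncnorm
from scipy.optimize import minimize

mu_true, sigma_true, a = 0.5, 2.0, 1.0
alpha_true = (a - mu_true) / sigma_true
x = truncnorm.rvs(alpha_true, np.inf, loc=mu_true, scale=sigma_true,
                  size=5000, random_state=1)

def nll(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)               # enforce sigma > 0
    alpha = (a - mu) / sigma
    # -log f(x) = 0.5*((x - mu)/sigma)^2 + log(sigma) + log(1 - Phi(alpha)) + const
    return np.sum(0.5 * ((x - mu) / sigma) ** 2) \
        + x.size * (log_sigma + norm.logsf(alpha))

res = minimize(nll, x0=[np.mean(x), np.log(np.std(x))], method="Nelder-Mead")
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, sigma_hat)    # should roughly recover (0.5, 2.0)
```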

In the context of state-space models (e.g., process/measurement noise partially observed due to one-sided constraints), EM algorithms are adapted for truncated Gaussian process and measurement noise (González et al., 25 Jul 2025). This involves moment-matching at the M-step, and the use of Monte Carlo or particle smoothers to estimate sufficient statistics under truncated innovation processes.

5. Applications in Graphical Models and Machine Learning

UTGDs naturally arise in graphical models with non-negativity or rectification constraints, as in Restricted Truncated Gaussian Graphical Models (RTGGM) and related deep learning architectures (Su et al., 2016):

  • In bipartite Gaussian graphical models, imposing non-negativity by replacing hidden variables with UTGDs results in conditionally independent, tractable univariate truncated distributions.
  • The conditional mean of a UTGD, as a function of its natural parameter, provides a smoothed ReLU activation: $\mu_T(\xi, \lambda^2) = \xi + \lambda\,\frac{\phi(\xi/\lambda)}{\Phi(\xi/\lambda)}$, converging to $\max(0, \xi)$ as $\lambda^2 \to 0$ (see the short sketch after this list).
  • Deep extensions allow parameter sharing and unsupervised pre-training for feedforward ReLU networks using UTGD-based conditional means, enabling the transfer of inference mechanisms from probabilistic models to deterministic neural architectures.
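
A short numerical sketch of the smoothed-ReLU behaviour quoted above (illustrative values only):

```python
# Numerical look at the smoothed-ReLU interpretation of the UTGD conditional
# mean: mu_T(xi, lam^2) = xi + lam * phi(xi/lam) / Phi(xi/lam).
# The ratio phi/Phi is computed via the scaled complementary error function
# so that small lam does not underflow; where Phi is essentially 1, erfcx may
# overflow to inf, which correctly drives the ratio to 0.
import numpy as np
from scipy.special import erfcx

def smoothed_relu(xi, lam):
    t = np.asarray(xi, dtype=float) / lam
    ratio = np.sqrt(2.0 / np.pi) / erfcx(-t / np.sqrt(2.0))  # phi(t)/Phi(t)
    return xi + lam * ratio

xi = np.linspace(-3.0, 3.0, 7)
for lam in (1.0, 0.1, 0.01):
    print(lam, np.round(smoothed_relu(xi, lam), 3))
print(np.maximum(0.0, xi))   # limiting ReLU values for comparison
```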

Contrastive divergence training of RTGGMs requires repeated sampling and expectation computation over UTGDs. The efficient sampling schemes from (Chopin, 2012) and closed analytic formulas for moments facilitate tractable learning.

6. Sub-Gaussianity, Concentration, and Variance Proxy

Applications of concentration inequalities or risk bounds frequently demand a sub-Gaussian variance proxy. UTGDs, while sub-Gaussian, are not strictly sub-Gaussian unless truncated symmetrically about the mean (Barreto et al., 13 Mar 2024):

  • For $X_{[a,\infty)}$, the optimal sub-Gaussian variance parameter is $s_{\mathrm{opt}}^2 = \sigma^2$, which always exceeds the true variance of the truncated distribution, except in symmetric truncations.
  • The variance proxy informs sharp concentration inequalities:

$$\Pr\bigl(X_{[a,\infty)} - \mathbb{E}[X_{[a,\infty)}] \ge t\bigr) \le \exp\!\left(-\frac{t^2}{2\sigma^2}\right), \quad t \ge 0,$$

regardless of the truncation point $a$ or mean shift (a brief numerical check follows).
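
A brief numerical check of these statements under illustrative parameters, showing that the proxy $\sigma^2$ upper-bounds the true truncated variance and that the empirical tail of centered UTGD samples respects the stated bound:

```python
# Compare the true truncated variance with the sub-Gaussian proxy sigma^2,
# and check the Gaussian tail bound empirically. Parameters are illustrative.
import numpy as np
from scipy.stats import norm, truncnorm

mu, sigma, a = 0.0, 1.0, 0.5
alpha = (a - mu) / sigma
lam = norm.pdf(alpha) / norm.sf(alpha)
true_var = sigma**2 * (1 + alpha * lam - lam**2)
print(true_var, sigma**2)               # true variance < proxy sigma^2

xs = truncnorm.rvs(alpha, np.inf, loc=mu, scale=sigma, size=200_000,
                   random_state=2)
centered = xs - xs.mean()
for t in (0.5, 1.0, 1.5):
    emp = np.mean(centered >= t)
    bound = np.exp(-t**2 / (2 * sigma**2))
    print(t, emp, bound)                # empirical tail probability <= bound
```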

A plausible implication is that in applications requiring finely tuned risk or uncertainty quantification (e.g., safety-critical control), the difference between the true variance and sub-Gaussian proxy must be accounted for, unless symmetric truncation can be assumed.

7. Practical and Computational Considerations

The computational tractability of UTGDs is foundational to their practical use:

| Aspect | Closed-form? | Computational feature |
| --- | --- | --- |
| PDF, CDF | Yes | Single-/bi-dimensional integrals |
| Mean, variance | Yes | Requires evaluation of $\phi$, $\Phi$ |
| Higher moments / MGF | Yes | Analytic or via numerical integration |
| Sampling | Yes (algorithms) | $O(1)$ per sample (Chopin, 2012) |
| Parameter estimation | Yes | Convex moment/ML methods (Kontonis et al., 2019) |
| Sub-Gaussian proxy | Yes | Explicitly $\sigma^2$ (Barreto et al., 13 Mar 2024) |

Efficient numerical routines for the normal CDF and its inverse are essential for both simulation and estimation. Precomputing and storing tables, as in (Chopin, 2012), further accelerates sampling. For higher-dimensional truncations, block-Gibbs or stratified rejection methods are most viable when the problem structure (e.g., Markov random field sparsity, parameter sign patterns) permits.

Explicit recognition of the upper variance bound (Petrella, 14 Nov 2025) and robust approximation formulae enable direct model calibration and identifiability in practical inverse problems, especially under limited or censored data.
