Censored and Shifted Gamma (CSG) Distribution

Updated 4 April 2026
  • The CSG distribution is a parametric family used to model non-negative mixed outcomes, particularly 24-hour precipitation accumulations.
  • It applies a censored and shifted gamma framework to represent both the probability of zero rainfall and the distribution of positive amounts.
  • Parameters are estimated by minimizing a closed-form CRPS, and semi-local clustering pools training data across similar grid points to stabilize ensemble forecast calibration.

The censored and shifted gamma (CSG) distribution is a parametric family designed to model non-negative, mixed discrete–continuous outcomes, most notably 24-hour precipitation accumulations. In operational ensemble forecast post-processing, the CSG distribution forms the core of a widely adopted ensemble model output statistics (EMOS) approach that enables direct calibration for both the probability of zero precipitation and the distribution of positive amounts. The CSG model delivers significant improvements in forecast skill and calibration, especially under operational constraints such as limited training data or the coexistence of dual-resolution ensemble sources (Szabó et al., 2022, Baran et al., 2015).

1. Mathematical Definition and Properties

A CSG random variable $X \sim \mathrm{CSG}(\alpha, \beta, \delta)$ is defined as the maximum of zero and a shifted gamma random variable: $X = \max(0, Y - \delta)$, where $Y \sim \mathrm{Gamma}(\alpha, \beta)$ with shape $\alpha$ and scale $\beta$, and $\delta > 0$.
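The definition translates directly into a simulation. The following is a minimal sketch using only the Python standard library; the parameter values are arbitrary illustrations rather than values from the cited papers, and `sample_csg` is a hypothetical helper name. `random.gammavariate(alpha, beta)` uses the same shape/scale convention as the gamma density in this section.

```python
import random

def sample_csg(alpha, beta, delta, n, seed=0):
    """Draw n values of X = max(0, Y - delta), Y ~ Gamma(shape=alpha, scale=beta)."""
    rng = random.Random(seed)
    return [max(0.0, rng.gammavariate(alpha, beta) - delta) for _ in range(n)]

# Illustrative (hypothetical) parameter values.
draws = sample_csg(alpha=0.8, beta=5.0, delta=1.0, n=10_000)
dry_fraction = sum(1 for x in draws if x == 0.0) / len(draws)
print(f"share of exact zeros: {dry_fraction:.3f}")
print(f"sample mean of X:     {sum(draws) / len(draws):.3f}")
```

The share of exact zeros is the empirical counterpart of the probability of no precipitation, which the censoring produces without a separate mixture component.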

Probability Functions

Let $g_{\alpha,\beta}(y)$ denote the gamma density and $G_{\alpha,\beta}(y)$ its cumulative distribution function (CDF):

$g_{\alpha,\beta}(y) = \frac{1}{\Gamma(\alpha)\,\beta^\alpha}\, y^{\alpha-1} e^{-y/\beta}, \quad y > 0$

$G_{\alpha,\beta}(y) = \int_0^y g_{\alpha,\beta}(t)\,dt$

The CDF and PDF of $X$ are:

$F_X(x) = \begin{cases} 0 & x < 0 \\[6pt] G_{\alpha,\beta}(x + \delta) & x \geq 0 \end{cases}$

$f_X(x) = \begin{cases} G_{\alpha,\beta}(\delta) & x = 0 \ \text{(point mass)} \\[6pt] g_{\alpha,\beta}(x + \delta) & x > 0 \end{cases}$

Here, $G_{\alpha,\beta}(\delta)$ is the point mass at zero precipitation, while for $x > 0$ the density is a shifted gamma, left-censored at zero.
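To make the CDF concrete, the sketch below evaluates it with a hand-written power series for the regularized lower incomplete gamma function, so that it needs only the standard library. The function names `gamma_cdf` and `csg_cdf` and the parameter values are assumptions made for illustration; the series is adequate for moderate values of $y/\beta$, and a production implementation would more likely call a numerical library.

```python
import math

def gamma_cdf(y, alpha, beta):
    """Gamma(shape=alpha, scale=beta) CDF via the series for P(alpha, y/beta)."""
    if y <= 0.0:
        return 0.0
    z = y / beta
    term = 1.0 / alpha          # n = 0 term of sum z^n / (alpha (alpha+1) ... (alpha+n))
    total = term
    n = 0
    while term > 1e-15 * total and n < 10_000:
        n += 1
        term *= z / (alpha + n)
        total += term
    return total * math.exp(alpha * math.log(z) - z - math.lgamma(alpha))

def csg_cdf(x, alpha, beta, delta):
    """F_X(x): 0 for x < 0, G(x + delta) for x >= 0."""
    return 0.0 if x < 0.0 else gamma_cdf(x + delta, alpha, beta)

# Illustrative (hypothetical) parameters; alpha = 1 makes Y exponential.
alpha, beta, delta = 1.0, 2.0, 0.5
p_zero = csg_cdf(0.0, alpha, beta, delta)            # probability of no precipitation
p_exceed_5 = 1.0 - csg_cdf(5.0, alpha, beta, delta)  # probability of more than 5 units
print(p_zero, p_exceed_5)
```

With $\alpha = 1$ the gamma CDF reduces to $1 - e^{-y/\beta}$, which gives a simple way to check the series against a known closed form.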

Mean and Variance

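Using the identity $\int_\delta^\infty y\, g_{\alpha,\beta}(y)\,dy = \alpha\beta\,[1 - G_{\alpha+1,\beta}(\delta)]$, the mean is: $\mathbb{E}[X] = \alpha\beta\,[1 - G_{\alpha+1,\beta}(\delta)] - \delta\,[1 - G_{\alpha,\beta}(\delta)]$. The second moment has an analogous structure, again expressible through incomplete-gamma identities, so both moments are available in closed form (Baran et al., 2015).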
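As a sanity check on the mean, the exponential special case $\alpha = 1$ can be worked by hand directly from the definition $X = \max(0, Y - \delta)$. This is an illustrative calculation, not a result quoted from the cited papers.

```latex
\begin{aligned}
\mathbb{E}[X]
  &= \int_{\delta}^{\infty} (y - \delta)\, \frac{1}{\beta}\, e^{-y/\beta}\, dy
   = \int_{0}^{\infty} u\, \frac{1}{\beta}\, e^{-(u + \delta)/\beta}\, du \\
  &= e^{-\delta/\beta} \int_{0}^{\infty} \frac{u}{\beta}\, e^{-u/\beta}\, du
   = \beta\, e^{-\delta/\beta}.
\end{aligned}
```

With $\delta = 0$ this reduces to the gamma mean $\alpha\beta = \beta$, and the factor $e^{-\delta/\beta} = 1 - G_{1,\beta}(\delta)$ is the probability of a strictly positive amount.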

2. Linking CSG Parameters to Ensemble Forecasts

Within the EMOS framework, the CSG's underlying gamma mean ($\mu = \alpha\beta$) and variance ($\sigma^2 = \alpha\beta^2$) are modeled as functions of the ensemble members. The canonical regression links are: $\mu = a_0 + a_1 f_1 + \cdots + a_K f_K, \quad \sigma^2 = b_0 + b_1 \bar{f}$, where $f_1, \ldots, f_K$ are ensemble member forecasts, $\bar{f}$ is the ensemble mean, and all coefficients are non-negative. The CSG parameters are recovered as: $\alpha = \mu^2 / \sigma^2, \quad \beta = \sigma^2 / \mu$. The shift $\delta$ is a further non-negative parameter. In the presence of exchangeable groups (e.g., high- and low-resolution ensemble subsets), group means replace individual member forecasts in the regression linkage.

In dual-resolution settings, the mean links as: $\mu = a_0 + a_H \bar{f}_H + a_L \bar{f}_L$, where “H” and “L” index high- and low-resolution groups and $\bar{f}_H$, $\bar{f}_L$ are the corresponding group means (Szabó et al., 2022).
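The mapping from ensemble forecasts to distribution parameters can be sketched as follows. The helper `csg_params`, the coefficient values, and the toy ensemble are all hypothetical; the sketch assumes linear links for the gamma mean and variance and that the variance link uses the overall ensemble mean in both the member-based and the dual-resolution case.

```python
def csg_params(mean_predictors, f_bar, a0, a, b0, b1, delta):
    """Map forecasts to (alpha, beta, delta) via linear mean/variance links."""
    if len(a) != len(mean_predictors):
        raise ValueError("need one mean-link coefficient per predictor")
    mu = a0 + sum(ak * fk for ak, fk in zip(a, mean_predictors))  # gamma mean
    sigma2 = b0 + b1 * f_bar                                      # gamma variance
    alpha = mu * mu / sigma2   # shape, from mu = alpha * beta
    beta = sigma2 / mu         # scale, from sigma2 = alpha * beta**2
    return alpha, beta, delta

# Member-based link on a toy four-member ensemble (hypothetical values).
members = [2.0, 3.5, 0.0, 1.5]
f_bar = sum(members) / len(members)
alpha, beta, delta = csg_params(
    members, f_bar, a0=0.2, a=[0.25] * 4, b0=0.5, b1=1.0, delta=0.3
)

# Dual-resolution variant: group means replace the individual members.
high, low = [2.0, 3.5], [0.0, 1.5]
group_means = [sum(high) / len(high), sum(low) / len(low)]
alpha_d, beta_d, _ = csg_params(
    group_means, f_bar, a0=0.2, a=[0.6, 0.4], b0=0.5, b1=1.0, delta=0.3
)
print(alpha, beta, alpha_d, beta_d)
```

Strictly positive intercepts keep the mean and variance positive even when every member forecasts zero precipitation, which is one reason the coefficients are constrained to be non-negative.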

3. Parameter Estimation via Proper Scoring Rules

CSG EMOS parameters (the regression coefficients of the mean and variance links together with the shift $\delta$, collected here in a vector $\theta$) are estimated by minimizing the mean Continuous Ranked Probability Score (CRPS) over a rolling training set: $\hat{\theta} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} \mathrm{CRPS}\big(F_i(\cdot\,; \theta),\, x_i\big)$, where $F_i$ is the predictive CSG CDF and $x_i$ the verifying observation of the $i$-th training case. A closed-form expression of the CRPS for the CSG distribution exists in terms of incomplete-gamma functions, which enables efficient direct numerical optimization (e.g., L-BFGS-B). This method is preferred to maximum likelihood estimation in typical operational settings, as it yields superior probabilistic performance. All parameters are box-constrained to maintain physical admissibility (non-negativity) (Baran et al., 2015, Szabó et al., 2022).
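The structure of this estimation step can be illustrated with a deliberately simplified sketch. Two substitutions are made and should be read as assumptions: the closed-form CRPS is replaced by a Monte Carlo estimate based on the identity $\mathrm{CRPS}(F, y) = \mathbb{E}|X - y| - \tfrac{1}{2}\mathbb{E}|X - X'|$, and L-BFGS-B is replaced by a coarse search over a small candidate grid. The training pairs, the grid, and the single-predictor mean link (ensemble mean only) are hypothetical.

```python
import random

def csg_sample(alpha, beta, delta, n, rng):
    return [max(0.0, rng.gammavariate(alpha, beta) - delta) for _ in range(n)]

def crps_empirical(samples, obs):
    """CRPS of the empirical distribution: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    xs = sorted(samples)
    abs_err = sum(abs(x - obs) for x in xs) / n
    # Mean absolute pairwise difference computed from the order statistics.
    spread = 2.0 * sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1)) / (n * n)
    return abs_err - 0.5 * spread

def mean_crps(theta, training, n_draws=1000, seed=0):
    """Average CRPS over (ensemble mean, observation) pairs."""
    a0, a1, b0, b1, delta = theta
    rng = random.Random(seed)
    total = 0.0
    for f_bar, obs in training:
        mu, sigma2 = a0 + a1 * f_bar, b0 + b1 * f_bar
        alpha, beta = mu * mu / sigma2, sigma2 / mu
        total += crps_empirical(csg_sample(alpha, beta, delta, n_draws, rng), obs)
    return total / len(training)

# Hypothetical toy training set: (ensemble mean, observed accumulation).
training = [(0.2, 0.0), (1.0, 0.4), (3.0, 2.1), (6.0, 7.5), (0.5, 0.0), (4.0, 3.0)]

# Coarse candidate grid standing in for a bounded gradient-based optimizer.
candidates = [
    (0.1, a1, 0.1, b1, delta)
    for a1 in (0.5, 1.0, 1.5)
    for b1 in (0.5, 1.0, 2.0)
    for delta in (0.0, 0.25, 0.5)
]
scores = {theta: mean_crps(theta, training) for theta in candidates}
best = min(scores, key=scores.get)
print("best theta:", best, "mean CRPS:", round(scores[best], 3))
```

Restricting the grid to non-negative values plays the role of the box constraints; with the closed-form CRPS the same objective becomes smooth and cheap enough for a bounded quasi-Newton method.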

4. Semi-local Training and Clustering

To achieve a balance between spatial localization and statistical robustness, CSG EMOS commonly adopts a semi-local training strategy:

  • For each forecast initialization, a rolling 30-day training window is used.
  • Each land grid point is characterized by a 24-dimensional feature vector combining quantiles of both the climatological precipitation CDF and the recent raw-ensemble-mean error distribution.
  • K-means clustering (with a prescribed number of clusters) is applied to group grid points with similar climatology and error characteristics.
  • Parameter estimation for each cluster aggregates data from all its points (typically 1000–1500 cases), enabling more stable and regionally adaptive calibration.

This approach maintains parameter locality while pooling data to mitigate the sample size limitations inherent to short rolling training windows (Szabó et al., 2022).
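The clustering and pooling steps above can be sketched with a plain Lloyd's-algorithm k-means on synthetic feature vectors. Everything concrete here is an assumption for illustration: the 24-dimensional features are random stand-ins for the climatological and error quantiles, the two "dry" and "wet" regimes are invented, and the evenly spaced initialization is chosen only for determinism (random restarts or k-means++ are common alternatives).

```python
import random

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic initialization from evenly spaced points (illustrative choice).
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center in squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center becomes the mean of its assigned points.
        for c in range(k):
            group = [p for p, lab in zip(points, labels) if lab == c]
            if group:  # keep the old center if a cluster ends up empty
                centers[c] = [sum(col) / len(group) for col in zip(*group)]
    return labels

# Synthetic 24-dimensional feature vectors for 40 hypothetical grid points.
rng = random.Random(1)
dry = [[rng.gauss(0.0, 0.3) for _ in range(24)] for _ in range(20)]
wet = [[rng.gauss(5.0, 0.3) for _ in range(24)] for _ in range(20)]
features = dry + wet

labels = kmeans(features, k=2)

# Pool grid-point indices per cluster; each pool would share one parameter fit.
pooled = {}
for idx, lab in enumerate(labels):
    pooled.setdefault(lab, []).append(idx)
print({lab: len(ids) for lab, ids in pooled.items()})
```

Each pool then supplies the training cases for a single CRPS minimization, so every grid point in a cluster shares one set of coefficients.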

5. Verification and Empirical Findings

CSG EMOS has been rigorously evaluated in operational and research contexts, notably on European dual-resolution ECMWF ensembles and regional ensemble systems.

Verification Metrics

  • Mean CRPS and CRPSS (skill score relative to raw ensemble)
  • Brier scores (BS) and skill scores (BSS) for preset thresholds (e.g., 0.1, 5, 10 mm)
  • Reliability diagrams
  • Statistical significance assessed via block-bootstrap and Diebold–Mariano tests with false-discovery-rate correction
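The score-based metrics in this list reduce to a few lines of arithmetic. The sketch below computes a Brier score and a generic skill score for negatively oriented scores; the probabilities and outcomes are made-up numbers, not results from the cited studies.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def skill_score(score, reference_score):
    """Skill relative to a reference for negatively oriented scores (CRPS, BS)."""
    return 1.0 - score / reference_score

# Hypothetical threshold-exceedance probabilities and the matching 0/1 outcomes.
outcomes = [1, 0, 0, 1]
raw_probs = [0.5, 0.5, 0.5, 0.5]
emos_probs = [0.75, 0.25, 0.25, 0.75]

bs_raw = brier_score(raw_probs, outcomes)     # 0.25
bs_emos = brier_score(emos_probs, outcomes)   # 0.0625
bss = skill_score(bs_emos, bs_raw)            # 0.75
print(bs_raw, bs_emos, bss)
```

The same `skill_score` helper yields the CRPSS when it is given the mean CRPS of the post-processed forecast and of the raw ensemble; positive values indicate improvement over the reference.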

Main Results

  • Raw dual-resolution ensembles exhibit under-dispersion and bias, with skill disparities apparent up to day 5.
  • CSG EMOS post-processing yields statistically significant CRPS reduction across all lead times (e.g., CRPSS ≈ 0.15 at day 1, ≈ 0.05 at day 5).
  • After CSG EMOS calibration, inter-configuration skill differences between dual-resolution mixtures are statistically insignificant.
  • Compared with quantile mapping (QM) and weighted QM—both requiring extensive historical reforecast data—CSG EMOS, trained on only 30 days of data, matches or slightly outperforms these alternatives in mean CRPS and Brier score at all time horizons.
  • Reliability is restored: at 0.1 mm thresholds, calibration is near-perfect even at lead time day 10; improvement is also observed at heavier rainfall thresholds despite data sparsity.

6. Distinctive Attributes and Operational Impact

Key advantages of the CSG EMOS framework in operational post-processing include:

  • Explicit probability mass at zero precipitation without recourse to mixture models.
  • Calibration and adjustment of ensemble forecast bias and dispersion through linear regression on ensemble statistics, not relying on discrete–continuous mixtures or additional covariates.
  • Unified parametric structure, reducing implementation complexity and fit instability.
  • Closed-form CRPS for rapid and robust practitioner adoption.
  • In comparative field tests, CSG EMOS demonstrates sharper, more calibrated forecasts and more accurate point predictions than both censored GEV EMOS and gamma BMA, particularly when considering forecast reliability and sharpness jointly.

These attributes render CSG EMOS a practical, computationally efficient, and statistically rigorous post-processing solution in contemporary ensemble forecast calibration, particularly advantageous in environments with limited historical reforecast archives or under dual-resolution computational constraints (Baran et al., 2015, Szabó et al., 2022).
