Discrete Generalized Pareto Distribution

Updated 18 March 2026
  • Discrete Generalized Pareto Distribution (DGPD) is a probability model for integer-valued tail data, obtained by discretizing the continuous GPD.
  • It employs a three-parameter structure—location, scale, and shape—to capture heavy-tailed, exponential, or truncated behaviors in discrete data.
  • Inference via maximum likelihood and the bootstrap, together with multivariate and zero-inflated extensions, supports its use across various fields.

The discrete generalized Pareto distribution (DGPD) is a theoretically grounded, parameterized family of probability distributions defined on the nonnegative integers, designed primarily to model the tail behavior of integer-valued data. It arises via a direct discretization of the continuous generalized Pareto distribution (GPD), inheriting its flexible tail properties and playing a central role in discrete extreme value theory, multivariate extremes, and advanced count-data modeling with heavy tails, threshold exceedances, and regression structures.

1. Definition, Parameterizations, and Theoretical Justification

The univariate DGPD is supported on $\{0, 1, 2, \ldots\}$ or $\{u, u+1, \ldots\}$ for some threshold or location parameter $u$. The standard definition takes differences of the continuous GPD survival function at consecutive integers, yielding the following probability mass function for $k \geq 0$ (or $x \geq \mu$ in the location form):

$$p(k; \sigma, \xi) = \left(1 + \frac{\xi k}{\sigma}\right)^{-1/\xi} - \left(1 + \frac{\xi (k+1)}{\sigma}\right)^{-1/\xi}, \quad \sigma > 0,\ \xi \in \mathbb{R}.$$

For $\xi = 0$, understood as the limit $\xi \to 0$, one obtains the discrete analogue of the exponential:

$$p(k; \sigma, 0) = e^{-k/\sigma}\left(1 - e^{-1/\sigma}\right).$$

The three-parameter location-scale-shape form is

$$p(x; \mu, \sigma, \xi) = \left[1 + \frac{\xi}{\sigma}(x - \mu)\right]^{-1/\xi} - \left[1 + \frac{\xi}{\sigma}(x - \mu + 1)\right]^{-1/\xi}, \quad x \geq \mu,$$

with $p(x) = 0$ for $x < \mu$ and $1 + \frac{\xi}{\sigma}(x - \mu) > 0$ required (Prieto et al., 2013).
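As a minimal sketch of the pmf above, the DGPD probability can be computed as the difference of continuous GPD survival values at consecutive integers. The function names `gpd_survival` and `dgpd_pmf` are illustrative and not taken from any package; the $\xi = 0$ case is treated as the exponential limit, and survival is set to zero beyond the finite endpoint that arises for $\xi < 0$.

```python
import math

def gpd_survival(x, sigma, xi):
    """Continuous GPD survival function P(Y >= x) for x >= 0."""
    if abs(xi) < 1e-12:                 # xi = 0: exponential limit
        return math.exp(-x / sigma)
    z = 1.0 + xi * x / sigma
    if z <= 0.0:                        # past the upper endpoint (xi < 0 only)
        return 0.0
    return math.exp(-math.log(z) / xi)  # equals z ** (-1 / xi)

def dgpd_pmf(k, sigma, xi, mu=0):
    """P(X = k) for the DGPD with location mu, scale sigma, shape xi."""
    if k < mu:
        return 0.0
    return gpd_survival(k - mu, sigma, xi) - gpd_survival(k - mu + 1, sigma, xi)

# The probabilities telescope, so the sum over k = 0..999 equals 1 - S(1000).
partial = sum(dgpd_pmf(k, sigma=2.0, xi=0.5) for k in range(1000))
print(partial, 1.0 - gpd_survival(1000, 2.0, 0.5))
```

The telescoping check is a convenient sanity test for any implementation of the pmf, since it does not depend on a closed-form normalizing constant.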

The DGPD is the unique nondegenerate limit for discrete threshold exceedance distributions under natural domain-of-attraction and rounding conditions. If $X$ is discrete and $X = \lfloor Y \rfloor$ for $Y$ in the maximum domain of attraction of a continuous GPD, then for high thresholds $u$, the law of $X - u \mid X \geq u$ converges uniformly to DGPD as $u \to \infty$ (Hitz et al., 2017, Aka et al., 24 Jun 2025).
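The discretization that defines the DGPD also suggests a simple simulation recipe: draw from the continuous GPD by inverse transform and take the floor. The sketch below assumes $\mu = 0$; `sample_dgpd` is a hypothetical helper name, and floating-point rounding very close to the upper endpoint (strongly negative $\xi$) is not handled.

```python
import math
import random

def sample_dgpd(n, sigma, xi, rng):
    """Draw n DGPD variates (mu = 0) by flooring continuous GPD draws."""
    draws = []
    for _ in range(n):
        u = 1.0 - rng.random()                   # uniform on (0, 1]
        if abs(xi) < 1e-12:
            y = -sigma * math.log(u)             # exponential limit
        else:
            y = sigma * (u ** (-xi) - 1.0) / xi  # solves S(y) = u for y
        draws.append(math.floor(y))
    return draws

rng = random.Random(0)
sample = sample_dgpd(20000, sigma=2.0, xi=0.5, rng=rng)
freq0 = sample.count(0) / len(sample)
# Model value for comparison: p(0) = 1 - (1 + 0.5 / 2) ** (-2) = 0.36
print(freq0)
```

Because $P(\lfloor Y \rfloor = k) = P(k \leq Y < k+1)$, the floored draws have exactly the pmf given by the survival differences.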

The tail behavior is governed by the shape parameter $\xi$:

  • $\xi > 0$: heavy-tailed (power-law) decay
  • $\xi = 0$: discrete exponential
  • $\xi < 0$: finite upper endpoint
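The three regimes can be checked numerically from the survival function $P(X \geq k) = (1 + \xi k/\sigma)^{-1/\xi}$ with $\mu = 0$. This is a small sketch; `dgpd_survival` and `largest_value` are illustrative names.

```python
import math

def dgpd_survival(k, sigma, xi):
    """P(X >= k) for the DGPD with mu = 0 and integer k >= 0."""
    if abs(xi) < 1e-12:
        return math.exp(-k / sigma)
    z = 1.0 + xi * k / sigma
    return math.exp(-math.log(z) / xi) if z > 0.0 else 0.0

def largest_value(sigma, xi):
    """Largest attainable count when xi < 0; None when the support is unbounded."""
    if xi >= 0.0:
        return None
    return math.ceil(-sigma / xi) - 1   # largest integer k with 1 + xi*k/sigma > 0

# xi > 0: for large k, doubling k multiplies the survival by about 2 ** (-1 / xi).
heavy_ratio = dgpd_survival(2_000_000, 2.0, 0.5) / dgpd_survival(1_000_000, 2.0, 0.5)
# xi = 0: each unit step multiplies the survival by exp(-1 / sigma).
geo_ratio = dgpd_survival(11, 2.0, 0.0) / dgpd_survival(10, 2.0, 0.0)
# xi < 0: the support ends at a finite integer.
endpoint = largest_value(2.0, -0.5)
print(heavy_ratio, geo_ratio, endpoint)
```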

In the multivariate case, the DGPD generalizes to the MDGPD, which retains threshold stability and tail dependence properties via spectral or geometric constructions (Aka et al., 24 Jun 2025).

2. Properties, Moments, and Special Cases

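The DGPD cdf, survival, and hazard functions are available in closed form. For $\mu = 0$ and integer $k \geq 0$, they follow directly from the pmf:

$$P(X \geq k) = \left(1 + \frac{\xi k}{\sigma}\right)^{-1/\xi}, \qquad P(X \leq k) = 1 - \left(1 + \frac{\xi (k+1)}{\sigma}\right)^{-1/\xi}, \qquad h(k) = \frac{p(k)}{P(X \geq k)} = 1 - \left(\frac{\sigma + \xi (k+1)}{\sigma + \xi k}\right)^{-1/\xi}.$$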
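In the $\mu = 0$ parameterization these functions follow directly from the pmf. The sketch below (function names are illustrative) computes them and shows that the hazard is constant for $\xi = 0$ and decreasing in $k$ for $\xi > 0$.

```python
import math

def survival(k, sigma, xi):
    """P(X >= k) for mu = 0."""
    if abs(xi) < 1e-12:
        return math.exp(-k / sigma)
    z = 1.0 + xi * k / sigma
    return math.exp(-math.log(z) / xi) if z > 0.0 else 0.0

def cdf(k, sigma, xi):
    """P(X <= k) = 1 - P(X >= k + 1)."""
    return 1.0 - survival(k + 1, sigma, xi)

def hazard(k, sigma, xi):
    """P(X = k | X >= k) = 1 - S(k + 1) / S(k)."""
    s = survival(k, sigma, xi)
    if s == 0.0:
        raise ValueError("k lies beyond the support")
    return 1.0 - survival(k + 1, sigma, xi) / s

heavy = [hazard(k, 2.0, 0.5) for k in range(5)]  # decreasing in k
flat = [hazard(k, 2.0, 0.0) for k in range(5)]   # constant: 1 - exp(-1 / sigma)
print(heavy, flat)
```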

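The $r$th moment exists if $\xi < 1/r$ (for example, $\xi < 1/2$ for a finite variance). In particular, for $\mu = 0$ and $\xi < 1$,

$$E[X] = \sum_{k=1}^{\infty} P(X \geq k) = \sum_{k=1}^{\infty} \left(1 + \frac{\xi k}{\sigma}\right)^{-1/\xi}.$$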

The mean and variance converge only for suitable $\xi$ (e.g., the mean exists for $\xi < 1$).
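For a nonnegative integer variable the mean equals the sum of $P(X \geq k)$ over $k \geq 1$, which gives a direct numerical route to the DGPD mean. The sketch below assumes $\mu = 0$ and truncates the sum at a hypothetical cutoff `kmax`; convergence becomes slow as $\xi$ approaches 1.

```python
import math

def dgpd_survival(k, sigma, xi):
    """P(X >= k) for mu = 0."""
    if abs(xi) < 1e-12:
        return math.exp(-k / sigma)
    z = 1.0 + xi * k / sigma
    return math.exp(-math.log(z) / xi) if z > 0.0 else 0.0

def dgpd_mean(sigma, xi, kmax=20000):
    """Truncated tail sum E[X] = sum_{k >= 1} P(X >= k); requires xi < 1."""
    if xi >= 1.0:
        raise ValueError("the mean is infinite for xi >= 1")
    return sum(dgpd_survival(k, sigma, xi) for k in range(1, kmax + 1))

# xi = 0: compare with the geometric closed form q / (1 - q), q = exp(-1 / sigma).
geo_mean = dgpd_mean(2.0, 0.0)
q = math.exp(-1.0 / 2.0)
# xi = 0.25: X = floor(Y) for a continuous GPD variable Y, so
# E[Y] - 1 < E[X] <= E[Y] = sigma / (1 - xi).
heavy_mean = dgpd_mean(2.0, 0.25)
print(geo_mean, q / (1 - q), heavy_mean)
```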

Special cases include:

  • Geometric distribution: DGPD with $\xi = 0$
  • Discrete Lomax: two-parameter DGPD with $\mu = 0$

DGPD is tail-equivalent to both the continuous GPD (with a continuity correction) and the generalized Zipf distribution for large scale $\sigma$, inheriting their tail indices (Hitz et al., 2017). It is also threshold-stable: for an integer threshold $u$, the exceedance $X - u$ given $X \geq u$ is again DGPD with the same shape $\xi$ and scale $\sigma + \xi u$.
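Threshold stability can be verified numerically. The sketch below assumes $\mu = 0$ and uses the standard GPD property that exceedances over an integer threshold $u$ keep the shape $\xi$ while the scale becomes $\sigma + \xi u$; the function names are illustrative.

```python
import math

def dgpd_survival(k, sigma, xi):
    """P(X >= k) for mu = 0."""
    if abs(xi) < 1e-12:
        return math.exp(-k / sigma)
    z = 1.0 + xi * k / sigma
    return math.exp(-math.log(z) / xi) if z > 0.0 else 0.0

def dgpd_pmf(k, sigma, xi):
    """P(X = k) as a difference of survival values."""
    return dgpd_survival(k, sigma, xi) - dgpd_survival(k + 1, sigma, xi)

def exceedance_pmf(k, u, sigma, xi):
    """P(X - u = k | X >= u), computed directly from the DGPD(sigma, xi) pmf."""
    return dgpd_pmf(u + k, sigma, xi) / dgpd_survival(u, sigma, xi)

sigma, xi, u = 2.0, 0.5, 7
direct = [exceedance_pmf(k, u, sigma, xi) for k in range(50)]
rescaled = [dgpd_pmf(k, sigma + xi * u, xi) for k in range(50)]
max_gap = max(abs(a - b) for a, b in zip(direct, rescaled))
print(max_gap)  # agreement up to floating-point error
```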

3. Parameter Estimation and Inference Methods

Parameter estimation proceeds via two main methods:

  • Seed (μ-(μ+1) frequency) method: For data $x_1, \ldots, x_n$, $\mu$ is set to the sample minimum. Relative frequencies at the smallest two counts are equated to the model pmf to solve numerically for $(\sigma, \xi)$ (Dzidzornu et al., 2020, Prieto et al., 2013).
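  • Maximum Likelihood Estimation (MLE): The log-likelihood for observations $x_1, \ldots, x_n$ is

$$\ell(\mu, \sigma, \xi) = \sum_{i=1}^{n} \log\left\{\left[1 + \frac{\xi}{\sigma}(x_i - \mu)\right]^{-1/\xi} - \left[1 + \frac{\xi}{\sigma}(x_i - \mu + 1)\right]^{-1/\xi}\right\}.$$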

Partial derivatives yield nonlinear score equations, which are solved numerically (e.g., using R's optim() with simulated annealing, method "SANN") (Dzidzornu et al., 2020).
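A minimal sketch of numerical maximum likelihood is given below, assuming $\mu = 0$. It is not the procedure of the cited papers: instead of simulated annealing it uses a simple derivative-free compass (pattern) search on $(\log \sigma, \xi)$, chosen only because it needs nothing beyond the standard library. The names `neg_log_lik` and `fit_dgpd`, the starting values, and the synthetic data are all illustrative assumptions.

```python
import math
import random
from collections import Counter

def log_survival(x, sigma, xi):
    """log P(X >= x) for mu = 0; -inf beyond the support."""
    if abs(xi) < 1e-9:
        return -x / sigma
    z = xi * x / sigma
    return -math.log1p(z) / xi if z > -1.0 else -math.inf

def neg_log_lik(theta, counts):
    """Negative DGPD log-likelihood; theta = (log sigma, xi), counts maps value -> frequency."""
    sigma, xi = math.exp(theta[0]), theta[1]
    total = 0.0
    for k, n_k in counts.items():
        p = math.exp(log_survival(k, sigma, xi)) - math.exp(log_survival(k + 1, sigma, xi))
        if p <= 0.0:                     # observation outside the model support
            return math.inf
        total -= n_k * math.log(p)
    return total

def fit_dgpd(data, start=(1.0, 0.1), step=0.5, tol=1e-6):
    """Compass search: try +/- step on each coordinate, halve the step when stuck."""
    counts = Counter(data)
    theta = [math.log(start[0]), start[1]]
    best = neg_log_lik(theta, counts)
    while step > tol:
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                cand = theta.copy()
                cand[i] += delta
                value = neg_log_lik(cand, counts)
                if value < best:
                    theta, best, improved = cand, value, True
        if not improved:
            step /= 2.0
    return math.exp(theta[0]), theta[1], best

rng = random.Random(42)
# Synthetic DGPD data (sigma = 2, xi = 0.3) obtained by flooring continuous GPD draws.
data = [math.floor(2.0 * ((1.0 - rng.random()) ** -0.3 - 1.0) / 0.3) for _ in range(5000)]
sigma_hat, xi_hat, nll = fit_dgpd(data)
print(sigma_hat, xi_hat)
```

Optimizing over $\log \sigma$ keeps the scale positive without explicit constraints, and returning an infinite objective outside the support lets the search reject infeasible shape values.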

Bootstrap methods are used to estimate standard errors and form confidence intervals: resample, re-fit DGPD, and use empirical quantiles of parameter estimates (Dzidzornu et al., 2020).
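The resample, re-fit, and quantile steps can be sketched as a percentile bootstrap. To keep the example short it is restricted, as an assumption, to the $\xi = 0$ special case with $\mu = 0$, where the MLE of $\sigma$ has a closed form; for the full DGPD the `fit` argument would be a numerical optimizer. All names and the synthetic data are illustrative.

```python
import math
import random

def fit_sigma_geometric(data):
    """Closed-form MLE of sigma when xi = 0: p(k) = (1 - q) q**k with q = exp(-1 / sigma)."""
    mean = sum(data) / len(data)
    q_hat = mean / (1.0 + mean)
    return -1.0 / math.log(q_hat)

def bootstrap_ci(data, fit, n_boot=500, level=0.95, seed=0):
    """Percentile interval: resample with replacement, re-fit, take empirical quantiles."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(fit(rng.choices(data, k=n)) for _ in range(n_boot))
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = int((1.0 + level) / 2.0 * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]

rng = random.Random(1)
# Synthetic data from the xi = 0 model with sigma = 2 (floored exponential draws).
data = [math.floor(-2.0 * math.log(1.0 - rng.random())) for _ in range(2000)]
sigma_hat = fit_sigma_geometric(data)
lo, hi = bootstrap_ci(data, fit_sigma_geometric)
print(sigma_hat, (lo, hi))
```

The standard deviation of the bootstrap estimates would serve as the standard error in the same way.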

Goodness-of-fit is assessed via:

  • Discrete Kolmogorov–Smirnov statistics (with parametric bootstrap for discrete null)
  • Pearson’s chi-squared test with binned support and expected counts
  • Discrete Q–Q plots (Prieto et al., 2013, Hitz et al., 2017)
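As a sketch of the binned Pearson check, the statistic below compares observed and expected counts on the cells $\{0\}, \ldots, \{m-2\}$ plus a pooled tail cell. It assumes $\mu = 0$, given parameter values, and strictly positive expected counts in every cell; the names, the number of cells, and the synthetic data are illustrative. With parameters fixed in advance the usual reference distribution is chi-squared with $m - 1$ degrees of freedom.

```python
import math
import random
from collections import Counter

def dgpd_survival(k, sigma, xi):
    """P(X >= k) for mu = 0."""
    if abs(xi) < 1e-12:
        return math.exp(-k / sigma)
    z = 1.0 + xi * k / sigma
    return math.exp(-math.log(z) / xi) if z > 0.0 else 0.0

def pearson_chi2(data, sigma, xi, n_cells=8):
    """Pearson statistic with cells {0}, ..., {n_cells - 2} and a pooled tail cell."""
    n = len(data)
    counts = Counter(min(x, n_cells - 1) for x in data)
    stat = 0.0
    for k in range(n_cells):
        if k < n_cells - 1:
            prob = dgpd_survival(k, sigma, xi) - dgpd_survival(k + 1, sigma, xi)
        else:
            prob = dgpd_survival(k, sigma, xi)   # pooled tail: P(X >= n_cells - 1)
        expected = n * prob
        stat += (counts.get(k, 0) - expected) ** 2 / expected
    return stat

rng = random.Random(7)
# Synthetic DGPD data with sigma = 2, xi = 0.3 (floored continuous GPD draws).
data = [math.floor(2.0 * ((1.0 - rng.random()) ** -0.3 - 1.0) / 0.3) for _ in range(5000)]
stat_true = pearson_chi2(data, 2.0, 0.3)    # model matches the data
stat_wrong = pearson_chi2(data, 5.0, 0.3)   # misspecified scale
print(stat_true, stat_wrong)
```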

4. Flexible Extensions: Bulk, Tail, and Zero-Inflated Models

The DGPD is well-suited for modeling high-threshold exceedances, but becomes less reliable for lower thresholds or full-support modeling. Several flexible extensions have been proposed (Ahmad et al., 2024, Ahmad et al., 2022):

  • Discrete Extended GPD (DEGPD): Introduces a CDF transformation $G$: $P(X = k) = G\{F(k+1; \sigma, \xi)\} - G\{F(k; \sigma, \xi)\}$, where $F$ is the continuous GPD cdf and $G$ is usually $G(v) = v^{\kappa}$, truncated Normal, or truncated Beta; $\kappa$ tunes the lower-tail or bulk behavior, and ordinary DGPD is recovered when $\kappa = 1$.
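  • Zero-Inflated DEGPD (ZIDEGPD): For excess zeros,

$$P(X = k) = \begin{cases} \pi + (1 - \pi)\, p_{\mathrm{DEGPD}}(0), & k = 0, \\ (1 - \pi)\, p_{\mathrm{DEGPD}}(k), & k = 1, 2, \ldots \end{cases}$$

with the bulk handled as above; $\pi$ is the probability of a structural zero (Ahmad et al., 2024, Ahmad et al., 2022).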

  • Smooth Transition Models: Bypass the need to preselect a threshold $u$ and allow parameters to vary as functions of covariates within a penalized likelihood regression (GAM) framework, using spline expansions and link functions for all DGPD (or DEGPD) parameters (Ahmad et al., 2022).
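The extended and zero-inflated pmfs can be sketched as follows. This assumes the power transformation $G(v) = v^{\kappa}$ applied to the continuous GPD cdf before discretizing, with $\mu = 0$, and a standard zero-inflation mixture with weight `pi0`; these choices and all function names are assumptions made for illustration, not the exact formulation of the cited papers.

```python
import math

def gpd_cdf(x, sigma, xi):
    """Continuous GPD cdf F(x) for x >= 0."""
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-x / sigma)
    z = 1.0 + xi * x / sigma
    return 1.0 - math.exp(-math.log(z) / xi) if z > 0.0 else 1.0

def degpd_pmf(k, sigma, xi, kappa):
    """P(X = k) = G(F(k + 1)) - G(F(k)) with the assumed choice G(v) = v ** kappa."""
    return gpd_cdf(k + 1, sigma, xi) ** kappa - gpd_cdf(k, sigma, xi) ** kappa

def zidegpd_pmf(k, sigma, xi, kappa, pi0):
    """Zero-inflated version: extra mass pi0 at zero, the rest scaled by 1 - pi0."""
    base = (1.0 - pi0) * degpd_pmf(k, sigma, xi, kappa)
    return pi0 + base if k == 0 else base

plain = degpd_pmf(0, 2.0, 0.5, kappa=1.0)    # kappa = 1 recovers the DGPD value p(0)
shifted = degpd_pmf(0, 2.0, 0.5, kappa=2.0)  # kappa > 1 moves mass away from zero
total = sum(zidegpd_pmf(k, 2.0, 0.5, 2.0, 0.2) for k in range(2000))
print(plain, shifted, total)
```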

For bulk+tail and ZI models, maximum likelihood estimation is used, often with numerical optimization and BIC/AIC-based model selection; a DEGPD fit with $\kappa$ close to 1 indicates that the classical DGPD is appropriate. Real data demonstrate that DEGPD and its extensions outperform Poisson/NB and classical DGPD in bulk+tail or low-threshold scenarios (Ahmad et al., 2024, Ahmad et al., 2022).

5. Multivariate Discrete Generalized Pareto Distributions

The MDGPD extends the DGPD to vectors of discrete exceedances over high thresholds (Aka et al., 24 Jun 2025). The standard MDGPD is defined through a stochastic representation with two components:

  • a geometric component,
  • an independent component with restricted support.

Simulation draws on the generator representation, which is built from i.i.d. draws.

Bootstrap and likelihood-free inference in higher dimensions use neural Bayes risk minimization: generate synthetic data under random parameter draws, train neural networks to predict the parameter vectors, and minimize squared error over large simulation batches (Aka et al., 24 Jun 2025).

The multivariate DGPD preserves tail dependence and threshold invariance, and it can accommodate flexible marginal and dependence structures, as illustrated in the modeling of spatial dry-spell extremes.

6. Applications and Empirical Performance

DGPD and its extensions have been empirically validated in diverse contexts:

  • Non-life insurance claims: DGPD systematically outperforms negative binomial models by orders of magnitude in AIC/BIC across both yearly and aggregated datasets, accurately capturing overdispersion and heavy claim frequency tails (Dzidzornu et al., 2020).
  • Road accident blackspots: DGPD and its discrete Lomax (DLomax) special case provide parsimonious parametric fits for counts per site; bootstrap-based discrete KS tests confirm adequacy in practical datasets (Prieto et al., 2013).
  • Discrete extremes: DGPD fits rare-event tails in Poisson simulations, word-frequency, tornado clusters, and multiple-birth data more accurately than continuous GPD (even with continuity corrections), demonstrating the necessity for discrete-tail modeling (Hitz et al., 2017).
  • Flexible regression and full-support modeling: ZIDEGPD–GAM and DEGPD outperform zero-inflated negative binomial GAMs in avalanche, doctor-visit, and complaint-count datasets, capturing both zero inflation and tail behavior robustly (Ahmad et al., 2022, Ahmad et al., 2024).
  • Multivariate extreme value analysis: MDGPD has been applied in spatial extremes such as Swiss dry-spell analysis, utilizing neural likelihood-free inference and exhibiting strong empirical fit and tail dependence estimation (Aka et al., 24 Jun 2025).

7. Practical Considerations, Limitations, and Outlook

The DGPD and its extensions offer several advantages:

  • Asymptotically justified for integer exceedances,
  • Closed-form pmfs and tractable parameterizations,
  • Tail flexibility, threshold invariance, and multivariate analogues,
  • Robust bootstrap-based confidence estimation and diagnostics.

Limitations include:

  • Need for custom numerical optimization; no widely available off-the-shelf implementations exist (although some R packages provide basic versions for recent extensions),
  • In small data or nearly homogeneous samples, the three-parameter model may offer minimal advantage,
  • Model extensions introduce additional identifiability challenges when bulk/zero parameters are weakly identified (Ahmad et al., 2024, Ahmad et al., 2022).

A plausible implication is that, while DGPD is an essential tool for discrete extreme value modeling, extended models such as DEGPD and ZIDEGPD, as well as MDGPD for multivariate inference, are necessary when modeling the full distribution, handling lower thresholds, or addressing zero inflation. Robust threshold selection and diagnostics remain active areas of methodological research.
