Poisson–Kingman Class: Foundations & Applications
- The Poisson–Kingman class is a foundational family of random probability measures obtained by normalizing the jumps of subordinators; it is central to species sampling and Bayesian inference.
- Generalizations like the PK^(r) and PD_α^(r) classes introduce negative binomial mixing and trimmed processes to model clustering and overdispersion.
- Computational methods such as Gibbs samplers and stick-breaking constructions enable efficient posterior inference in complex infinite-dimensional models.
The Poisson–Kingman class is a foundational family of random probability measures and associated exchangeable partitions generated by normalizing the ranked jumps of subordinators with general Lévy measures. It contains, as special cases, key models such as the Dirichlet process, the two-parameter Poisson–Dirichlet (Pitman–Yor) process, the normalized generalized gamma process, and their extensions, and it is pivotal in Bayesian nonparametric theory, partition structures, and infinite-dimensional diffusion models. Negative binomial mixing and polynomial tilting generalize the class further, yielding a spectrum of families, such as the generalized PK class, that govern increasingly complex species sampling, clustering, and diffusion phenomena.
1. Foundational Construction and Formal Definition
The canonical Poisson–Kingman law PK($\rho$) is constructed from a Poisson point process on $(0, \infty)$ with Lévy density $\rho$, which satisfies conditions ensuring that the sum of its points is almost surely finite and positive. Given the ranked points (jumps) $J_1 \ge J_2 \ge \cdots > 0$, the total mass $T = \sum_{i \ge 1} J_i$ is almost surely finite. The normalized sequence $(J_i / T)_{i \ge 1}$ takes values in the Kingman simplex $\nabla_\infty = \{(x_1, x_2, \ldots) : x_1 \ge x_2 \ge \cdots \ge 0,\ \sum_i x_i = 1\}$. The law of this sequence depends only on $\rho$ and is denoted PK($\rho$) (Ipsen et al., 2018). Equivalently, if $(X_t)_{t \ge 0}$ is a driftless subordinator with Lévy density $\rho$, the ranked jumps up to time $t$, normalized by $X_t$, have law PK($t\rho$), with $t = 1$ yielding the standard PK($\rho$) family (Ipsen et al., 2018). Conditioning on the total mass, or introducing mixing via a random time change or tilting, yields mixed or extended Poisson–Kingman models (James, 2010, Cerquetti, 2011).
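The construction above can be sketched numerically for one concrete choice of Lévy density. The snippet below is a minimal illustration, not taken from the cited papers: it assumes the $\alpha$-stable density $\rho(x) \propto x^{-\alpha-1}$ and uses the inverse-tail (Ferguson–Klass type) series, in which the ranked jumps are proportional to $\Gamma_i^{-1/\alpha}$ for the arrival times $\Gamma_i$ of a unit-rate Poisson process. The function name `stable_pk_weights` and the truncation level `n_jumps` are hypothetical, and truncating the series makes the result an approximation.

```python
import math
import random


def stable_pk_weights(alpha, n_jumps=2000, seed=0):
    """Approximate ranked, normalized jumps for an alpha-stable Levy density.

    Assumption: rho(x) is proportional to x**(-alpha - 1); the constant
    cancels after normalization. The series is truncated at n_jumps terms.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    rng = random.Random(seed)
    arrival = 0.0
    jumps = []
    for _ in range(n_jumps):
        arrival += rng.expovariate(1.0)          # next unit-rate Poisson arrival
        jumps.append(arrival ** (-1.0 / alpha))  # jumps decrease as arrivals grow
    total = math.fsum(jumps)                     # truncated total mass T
    return [j / total for j in jumps]


weights = stable_pk_weights(alpha=0.5, n_jumps=500, seed=1)
print(weights[:5])
```

The returned list is already in decreasing order, so it is a (truncated) point of the Kingman simplex.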
2. Generalizations via Negative Binomial and “Trimmed” Processes
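Negative binomial generalizations, PK$^{(r)}(\rho)$, extend PK($\rho$) by replacing the Poisson process with a negative binomial point process parameterized by $r > 0$ (Ipsen et al., 2018, Ipsen et al., 2016). This process is characterized by its Laplace functional: $\mathbb{E}\big[e^{-\mathbb{B}(f)}\big] = \big(1 + \int_0^\infty (1 - e^{-f(x)})\,\rho(x)\,dx\big)^{-r}$ for nonnegative measurable $f$. The ordered jumps $J_1 \ge J_2 \ge \cdots$ are normalized by their total sum $T$, giving a sequence in the Kingman simplex $\nabla_\infty$ whose law is denoted PK$^{(r)}(\rho)$. Through a Poisson–Gamma mixture perspective, PK$^{(r)}(\rho)$ coincides with PK($G\rho$), where $G$ is an independent Gamma($r$, 1) variable (Ipsen et al., 2018, Griffiths et al., 2024).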
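The Poisson–Gamma mixture view can be checked on a single counting variable. The sketch below is illustrative only: it assumes a set $A$ with Lévy mass `lam`, draws $G \sim$ Gamma($r$, 1), and then draws the count as Poisson with mean $G \cdot$ `lam`. Under these assumptions the count is negative binomial with mean $r\,\mathtt{lam}$ and variance $r\,\mathtt{lam}(1 + \mathtt{lam})$, so it is overdispersed relative to a Poisson count. The helper names are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate by counting unit-rate arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def negbin_counts(r, lam, n_samples=20000, seed=0):
    """Counts N with G ~ Gamma(r, 1) and N | G ~ Poisson(G * lam)."""
    rng = random.Random(seed)
    return [poisson(rng, rng.gammavariate(r, 1.0) * lam)
            for _ in range(n_samples)]


def mean_var(xs):
    """Sample mean and unbiased sample variance."""
    m = math.fsum(xs) / len(xs)
    v = math.fsum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


m, v = mean_var(negbin_counts(r=2.0, lam=3.0))
print(m, v)  # theory under the stated assumptions: mean 6, variance 24
```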
A key case arises from the trimmed $\alpha$-stable subordinator, where one removes the largest $r$ jumps and normalizes the remaining jumps (Ipsen et al., 2016). The resulting distribution, denoted PD$_\alpha^{(r)}$, generalizes Kingman's PD$_\alpha$ by controlling the deletion of the largest jumps and is linked structurally to negative binomial point process representations.
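The trimming operation itself is simple to sketch. The snippet below is a rough illustration under stated assumptions, not the construction of the cited paper: it generates approximate ranked $\alpha$-stable jumps through a truncated inverse-tail series, discards the `r` largest, and renormalizes what remains. The function name `trimmed_stable_weights` and the truncation level `n_jumps` are hypothetical.

```python
import math
import random


def trimmed_stable_weights(alpha, r, n_jumps=2000, seed=0):
    """Drop the r largest (approximate) stable jumps and renormalize the rest.

    Assumption: jumps are proportional to Gamma_i ** (-1 / alpha) for the
    arrival times Gamma_i of a unit-rate Poisson process, truncated at n_jumps.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not 0 <= r < n_jumps:
        raise ValueError("r must satisfy 0 <= r < n_jumps")
    rng = random.Random(seed)
    arrival = 0.0
    jumps = []
    for _ in range(n_jumps):
        arrival += rng.expovariate(1.0)
        jumps.append(arrival ** (-1.0 / alpha))  # already in decreasing order
    kept = jumps[r:]                             # remove the r largest jumps
    total = math.fsum(kept)                      # trimmed total mass
    return [j / total for j in kept]


print(trimmed_stable_weights(alpha=0.5, r=2, n_jumps=500, seed=1)[:5])
```

With `r=0` the function reduces to the untrimmed normalization.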
3. Explicit Characterizations: Partition Laws, EPPF, and Stick-Breaking
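The partition structure induced by sampling from PK($\rho$) or its extensions is characterized by the exchangeable partition probability function (EPPF), given in its most general form by $p(n_1, \ldots, n_k) = \frac{1}{\Gamma(n)} \int_0^\infty u^{n-1} e^{-\psi(u)} \prod_{j=1}^k \kappa_{n_j}(u)\,du$, where $\psi(u) = \int_0^\infty (1 - e^{-ux})\,\rho(x)\,dx$ is the Laplace exponent (Dolera et al., 2021, Ipsen et al., 2018). For the negative binomial class, the EPPF involves a further integration over the gamma mixing variable (Griffiths et al., 2024): $p^{(r)}(n_1, \ldots, n_k) = \frac{\Gamma(r+k)}{\Gamma(r)\,\Gamma(n)} \int_0^\infty u^{n-1} \big(1 + \psi(u)\big)^{-(r+k)} \prod_{j=1}^k \kappa_{n_j}(u)\,du$, with $\kappa_m(u) = \int_0^\infty x^m e^{-ux}\,\rho(x)\,dx$ and $n = n_1 + \cdots + n_k$.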
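As a worked check, the general EPPF can be specialized to the gamma-process case. The derivation below assumes the standard integral form $p(n_1, \ldots, n_k) = \frac{1}{\Gamma(n)} \int_0^\infty u^{n-1} e^{-\psi(u)} \prod_j \kappa_{n_j}(u)\,du$ with $\kappa_m(u) = \int_0^\infty x^m e^{-ux} \rho(x)\,dx$; the symbols $u$ and $\kappa_m$ are notation chosen here for illustration. With $\rho(x) = \theta x^{-1} e^{-x}$ it recovers the Ewens formula.

```latex
\begin{aligned}
\psi(u) &= \int_0^\infty \bigl(1 - e^{-ux}\bigr)\,\theta x^{-1} e^{-x}\,dx = \theta \log(1+u), \\
\kappa_m(u) &= \int_0^\infty x^{m}\, e^{-ux}\,\theta x^{-1} e^{-x}\,dx
             = \frac{\theta\,\Gamma(m)}{(1+u)^{m}}, \\
p(n_1,\dots,n_k)
  &= \frac{\theta^{k} \prod_{j=1}^{k} (n_j-1)!}{\Gamma(n)}
     \int_0^\infty \frac{u^{n-1}}{(1+u)^{n+\theta}}\,du \\
  &= \frac{\theta^{k} \prod_{j=1}^{k} (n_j-1)!}{\Gamma(n)}
     \cdot \frac{\Gamma(n)\,\Gamma(\theta)}{\Gamma(n+\theta)}
   = \frac{\theta^{k}}{(\theta)_n} \prod_{j=1}^{k} (n_j-1)!,
\end{aligned}
```

where $(\theta)_n = \Gamma(\theta+n)/\Gamma(\theta)$ is the rising factorial and the integral is the Beta function $B(n, \theta)$.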
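Stick-breaking representations exist in both classical and generalized settings. For PK$^{(r)}(\rho)$ with the gamma-process Lévy density $\rho(x) = \theta x^{-1} e^{-x}$, the residual fractions are i.i.d. Beta($1$, $\theta G$) given the Gamma($r$, 1) mixing variable $G$, and hence follow a Beta mixture unconditionally (Ipsen et al., 2018). For PK$^{(r)}(\rho)$ with $\rho$ the $\alpha$-stable Lévy density, $0 < \alpha < 1$, the $i$-th residual is Beta($1-\alpha$, $i\alpha$), independent across $i$ and not depending on $r$. In the case of PD$_\alpha^{(r)}$, the size-biased picks and residual fractions yield stick-breaking variables with Beta-type distributions and explicit truncations (Ipsen et al., 2016).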
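For the classical setting, stick-breaking can be sketched in a few lines. The example below is a minimal illustration for the two-parameter case only, using the well-known residuals $V_i \sim$ Beta($1-\alpha$, $\theta + i\alpha$); it does not implement the generalized or trimmed constructions. The function name `gem_weights` and the truncation level `n_sticks` are hypothetical.

```python
import random


def gem_weights(alpha, theta, n_sticks=1000, seed=0):
    """Size-biased weights W_i = V_i * prod_{j<i} (1 - V_j), truncated.

    Assumes the two-parameter case: V_i ~ Beta(1 - alpha, theta + i * alpha),
    independent, with 0 <= alpha < 1 and theta > -alpha.
    Returns the weights and the leftover (unassigned) stick length.
    """
    if not (0.0 <= alpha < 1.0 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    rng = random.Random(seed)
    remaining = 1.0
    weights = []
    for i in range(1, n_sticks + 1):
        v = rng.betavariate(1.0 - alpha, theta + i * alpha)
        weights.append(remaining * v)  # break off a fraction v of what is left
        remaining *= 1.0 - v
    return weights, remaining


weights, leftover = gem_weights(alpha=0.25, theta=1.0, n_sticks=200, seed=1)
ranked = sorted(weights, reverse=True)  # approximate ranked weights
print(ranked[:5], leftover)
```

Sorting the size-biased weights gives an approximation to the ranked sequence; the leftover mass quantifies the truncation error.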
4. Key Special Cases: Dirichlet, Poisson–Dirichlet, Normalized Generalized Gamma
Important specializations include:
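| Model | Lévy density $\rho(x)$ | EPPF / product form |
|---|---|---|
| Dirichlet / PD(0, θ) | $\theta x^{-1} e^{-x}$ | Classical Ewens: $\frac{\theta^k}{(\theta)_n} \prod_{j=1}^k (n_j - 1)!$ |
| PD(α, θ) | $\frac{\alpha}{\Gamma(1-\alpha)} x^{-\alpha-1} e^{-x}$, gamma-mixed with $r = \theta/\alpha$ | $\frac{\prod_{i=1}^{k-1} (\theta + i\alpha)}{(\theta+1)_{n-1}} \prod_{j=1}^k (1-\alpha)_{n_j - 1}$ |
| Normalized GG | $\propto x^{-\alpha-1} e^{-\tau x}$ | Product with $V_{n,k}$ |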
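The two-parameter Poisson–Dirichlet law with $0 < \alpha < 1$ and $\theta > 0$ arises as PK$^{(r)}(\rho)$ with $r = \theta/\alpha$ and $\rho(x) = \frac{\alpha}{\Gamma(1-\alpha)} x^{-\alpha-1} e^{-x}$; its stick-breaking sequence has $V_i \sim$ Beta($1-\alpha$, $\theta + i\alpha$), and its EPPF is a function only of the block counts and the model parameters (Ipsen et al., 2018, Cerquetti, 2011).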
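The product-form EPPF of the two-parameter family is easy to evaluate and to sanity-check. The sketch below assumes the standard Pitman formula $\frac{\prod_{i=1}^{k-1}(\theta + i\alpha)}{(\theta+1)_{n-1}} \prod_j (1-\alpha)_{n_j-1}$ and verifies that it sums to one over all set partitions of a small set. The function names are hypothetical.

```python
def rising(x, m):
    """Rising factorial x (x + 1) ... (x + m - 1); equals 1 when m == 0."""
    out = 1.0
    for i in range(m):
        out *= x + i
    return out


def pd_eppf(counts, alpha, theta):
    """Two-parameter EPPF evaluated at block sizes `counts`."""
    n, k = sum(counts), len(counts)
    numerator = 1.0
    for i in range(1, k):
        numerator *= theta + i * alpha
    product = 1.0
    for nj in counts:
        product *= rising(1.0 - alpha, nj - 1)
    return numerator / rising(theta + 1.0, n - 1) * product


def block_size_lists(n):
    """Yield the block sizes of every set partition of {1, ..., n}."""
    def extend(sizes, placed):
        if placed == n:
            yield tuple(sizes)
            return
        for j in range(len(sizes)):      # put the next element in block j
            sizes[j] += 1
            yield from extend(sizes, placed + 1)
            sizes[j] -= 1
        sizes.append(1)                  # or open a new block
        yield from extend(sizes, placed + 1)
        sizes.pop()
    yield from extend([], 0)


total = sum(pd_eppf(s, alpha=0.3, theta=1.2) for s in block_size_lists(6))
print(total)  # should be 1 up to floating-point error
```

Setting `alpha=0` reduces `pd_eppf` to the Ewens formula in the first table row.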
5. Sampling, Gibbs Samplers, and Computational Schemes
Efficient posterior inference for Poisson–Kingman and generalized class priors is enabled by both marginal and hybrid MCMC algorithms. Marginal samplers exploit auxiliary variable representations (e.g., the Zolotarev integral for $\alpha$-stable cases), integrating out the infinite-dimensional random probability measure in favor of finite random partitions and auxiliary variables (Lomelí et al., 2014). Hybrid samplers use the surplus-mass representation: they condition on the finitely many occupied cluster masses and the surplus mass, which allows efficient Gibbs and slice updates (Lomelí et al., 2015). The two-variable Gibbs sampler for the generalized negative binomial PK class employs an auxiliary variable $u$ with conditional density proportional to the EPPF integrand, which permits tractable cluster assignment and predictive updates, as well as efficient population genetic simulations via coalescent and ancestral tree reconstructions (Griffiths et al., 2024).
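The predictive step that marginal samplers build on can be sketched for the simplest tractable case. The code below is not any of the cited samplers: it assumes the two-parameter (Pitman–Yor) prior with no likelihood term, where item $i+1$ joins existing cluster $j$ with probability $(n_j - \alpha)/(\theta + i)$ and opens a new cluster with probability $(\theta + k\alpha)/(\theta + i)$. A full Gibbs update would multiply these weights by data likelihoods. The function name `pitman_yor_seating` is hypothetical.

```python
import random


def pitman_yor_seating(n, alpha, theta, seed=0):
    """Sample a random partition of n items from the sequential predictive rule.

    Assumes 0 <= alpha < 1 and theta > -alpha. Returns per-item cluster
    labels (in order of appearance) and the final cluster sizes.
    """
    if not (0.0 <= alpha < 1.0 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    rng = random.Random(seed)
    counts, labels = [], []
    for _ in range(n):
        if not counts:
            j = 0                                    # first item opens cluster 0
        else:
            weights = [c - alpha for c in counts]    # join an existing cluster
            weights.append(theta + alpha * len(counts))  # open a new cluster
            j = rng.choices(range(len(weights)), weights=weights)[0]
        if j == len(counts):
            counts.append(1)
        else:
            counts[j] += 1
        labels.append(j)
    return labels, counts


labels, counts = pitman_yor_seating(n=200, alpha=0.5, theta=1.0, seed=3)
print(len(counts), sorted(counts, reverse=True)[:5])
```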
6. Dualities, Diffusions, and Partition-Valued Stochastic Processes
The Poisson–Kingman framework is central to the study of coagulation-fragmentation dualities and infinite-dimensional diffusions. Coagulation and fragmentation operators on PK partitions admit duality relations generalizing those of the two-parameter PD family (James, 2010). For instance, composing bridges associated with independent subordinators yields identities that connect PK models with differing parameters under transformation. Infinite-dimensional diffusions (e.g., the two-parameter Poisson–Dirichlet diffusion and multiple Poisson–Dirichlet diffusions on generalized Kingman simplices) retain PK-derived distributions as unique stationary laws, connect naturally to marked partition structures, and generalize the neutral Wright–Fisher model to settings with both multiple marks and parameters (Griffiths et al., 2021, Costantini et al., 2026).
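For orientation, the classical two-parameter duality that these results generalize can be written compactly. The display below states Pitman's coagulation-fragmentation duality as it is usually given, for $0 < \alpha < 1$, $0 \le \beta < 1$ and $\theta > -\alpha\beta$; it is background for the PD family and not the specific identity of James (2010).

```latex
\mathrm{PD}(\alpha,\theta)
  \;\xrightarrow{\;\mathrm{PD}(\beta,\,\theta/\alpha)\text{-coagulation}\;}\;
\mathrm{PD}(\alpha\beta,\theta),
\qquad
\mathrm{PD}(\alpha\beta,\theta)
  \;\xrightarrow{\;\mathrm{PD}(\alpha,\,-\alpha\beta)\text{-fragmentation}\;}\;
\mathrm{PD}(\alpha,\theta).
```

Read left to right, the first arrow merges blocks of a PD($\alpha$, $\theta$) partition according to an independent PD($\beta$, $\theta/\alpha$) partition; the second arrow splits each block of a PD($\alpha\beta$, $\theta$) partition by an independent PD($\alpha$, $-\alpha\beta$) partition, and the two operations are dual to each other.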
7. Applications and Significance
The Poisson–Kingman class constitutes the mathematical backbone for species sampling models, Bayesian nonparametric inference, clustering, and partition structures in probability theory. Its key features (a tractable EPPF, diverse stick-breaking constructions, flexibility for modeling overdispersion and clustering, and efficient computational schemes) make it essential for population genetics, mixture modeling, ecology, combinatorics, and infinite-dimensional probability. The class's deep structural identities, its connections to coalescent theory, and its ability to recover standard models (Dirichlet, Pitman–Yor, and beyond) provide a unifying framework for discrete random probability measures and exchangeable partitions (Cerquetti, 2011, Lomelí et al., 2014, Ipsen et al., 2016).