Beta & Poisson-Dirichlet Coalescents
- Beta and Poisson-Dirichlet Coalescents are stochastic models capturing genealogies with multiple simultaneous mergers under skewed offspring distributions.
- They employ Beta and Poisson-Dirichlet laws to quantify external branch lengths, collision events, and extreme block sizes, crucial for non-neutral evolution studies.
- Rigorous analytical methods, such as moment recursion and renewal approximations, underpin precise scaling limits and applications in evolutionary genetics.
Beta and Poisson-Dirichlet Coalescents are fundamental objects in the theory of exchangeable random partitions and stochastic processes describing genealogies in populations with highly skewed offspring distributions. These coalescent processes generalize the classic Kingman coalescent by allowing for multiple mergers (“simultaneous collisions”) of ancestral lineages, with particular focus on the Beta and Poisson-Dirichlet (PD) distributional classes which characterize different regimes of coalescent behavior. Their study underpins a rigorous framework for modeling genealogies under selection, large offspring variance, and various non-neutral evolutionary mechanisms.
1. Definitions and Structural Properties
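A $\Lambda$-coalescent is an exchangeable Markov process on the partitions of a finite set (e.g., $\{1,\dots,n\}$), whose transitions allow $k$ blocks to merge at once, with $2 \le k \le b$ when there are $b$ blocks present. The rate at which a given $k$-tuple merges is
$$\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx).$$
Here, $\Lambda$ is a finite measure on $[0,1]$. For the Beta-coalescent, $\Lambda = \mathrm{Beta}(a,b)$; the one-parameter family $\mathrm{Beta}(2-\alpha,\alpha)$ with $0<\alpha<2$ is the case most often studied.
Kingman's coalescent corresponds to $\Lambda = \delta_0$. The Bolthausen–Sznitman coalescent uses $\Lambda = \mathrm{Uniform}[0,1]$, that is, $\mathrm{Beta}(1,1)$; the Beta-coalescent generalizes this, yielding a spectrum of behaviors depending on the parameters $(a,b)$.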
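For the one-parameter family $\Lambda = \mathrm{Beta}(2-\alpha,\alpha)$, the integral $\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx)$ reduces to a ratio of Beta functions, $\lambda_{b,k} = B(k-\alpha,\, b-k+\alpha)/B(2-\alpha,\alpha)$. The following is a minimal sketch of that computation; the function names are illustrative and do not come from the cited papers.

```python
from math import comb, exp, lgamma


def log_beta(x: float, y: float) -> float:
    """Logarithm of the Euler Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def merger_rate(b: int, k: int, alpha: float) -> float:
    """Rate at which a given k-tuple among b blocks merges,
    assuming Lambda = Beta(2 - alpha, alpha) with 0 < alpha < 2."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (0, 2)")
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return exp(log_beta(k - alpha, b - k + alpha) - log_beta(2.0 - alpha, alpha))


def total_rate(b: int, alpha: float) -> float:
    """Total merger rate out of a state with b blocks (moderate b only,
    since the binomial coefficient is converted to a float)."""
    return sum(comb(b, k) * merger_rate(b, k, alpha) for k in range(2, b + 1))
```

Two quick sanity checks are available: the rates of any $\Lambda$-coalescent satisfy the consistency relation $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$, and for $\alpha = 1$ (Bolthausen–Sznitman) the total rate out of $b$ blocks equals $b-1$.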
The Poisson–Dirichlet (PD) coalescents correspond to a class of $\Xi$-coalescents (further generalizations of $\Lambda$-coalescents, allowing simultaneous multiple mergers), with $\mathrm{PD}(\alpha,\theta)$ laws parameterizing the random frequencies of blocks via stick-breaking or Poisson–Kingman constructions. In certain sampling limits, the finite-dimensional distributions of block sizes converge to a PD law (Siri-Jégousse et al., 2013).
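The stick-breaking construction mentioned above can be sketched as follows. This assumes the standard two-parameter representation, in which independent $V_i \sim \mathrm{Beta}(1-\alpha,\, \theta + i\alpha)$ break off successive fractions of the remaining stick and ranking the pieces gives an approximate $\mathrm{PD}(\alpha,\theta)$ sample. The truncation level `n_sticks` and the function name are illustrative choices, not part of the cited work.

```python
import random


def sample_pd_frequencies(alpha, theta, n_sticks, rng):
    """Approximate draw of ranked PD(alpha, theta) frequencies by
    truncated stick-breaking (0 <= alpha < 1, theta > -alpha)."""
    if not 0.0 <= alpha < 1.0 or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    remaining = 1.0  # length of stick not yet broken off
    weights = []
    for i in range(1, n_sticks + 1):
        v = rng.betavariate(1.0 - alpha, theta + i * alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Ranking the size-biased pieces gives the PD ordering.
    return sorted(weights, reverse=True)


rng = random.Random(0)
freqs = sample_pd_frequencies(alpha=0.5, theta=0.0, n_sticks=500, rng=rng)
print(freqs[:5], sum(freqs))
```

Because the sequence is truncated, the frequencies sum to slightly less than one; the missing mass is the unbroken remainder of the stick.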
2. Asymptotic Behavior and Scaling Limits
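Under Beta$(2-\alpha,\alpha)$-coalescents with $1<\alpha<2$, the key asymptotic results concern the external branch length $T^{(n)}$ (the time a singleton persists before coalescence), the total external tree length $L^{(n)}_{\mathrm{ext}}$, and the total number of collisions.
External branch length: As $n\to\infty$,
$$n^{\alpha-1}\,T^{(n)} \xrightarrow{d} T,$$
where $T$ is a random variable with an explicit density and power-law tail behavior.
The number of collisions until absorption of a leaf, properly rescaled, converges in law to a Beta-distributed limit (Dhersin et al., 2012).
Total external length: For $1<\alpha<2$,
$$n^{\alpha-2}\,L^{(n)}_{\mathrm{ext}} \xrightarrow{P} \alpha(\alpha-1)\Gamma(\alpha),$$
with variance and covariance structure detailed via explicit asymptotic expansions. The ratio $L^{(n)}_{\mathrm{ext}}/L^{(n)}$ of external to total tree length converges in probability to $2-\alpha$, sharply contrasting the Kingman regime, where this ratio tends to $0$ (Dhersin et al., 2012).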
Largest block and minimal clade: The largest block, observed either at a deterministic small time scale or at the coalescence time of a fixed singleton, obeys Gumbel-type extreme value limit theorems. The minimal clade size (the size of the block containing a fixed element at its coalescence time) is heavy-tailed, with power-law decay (Siri-Jégousse et al., 2013).
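Total number of collisions: For Beta$(1,b)$-coalescents, the suitably centered and scaled total number of collisions converges in law to a spectrally negative $1$-stable law; in the general Beta$(a,b)$-coalescent with $0<a<1$, the limit is a $(2-a)$-stable law (Gnedin et al., 2012).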
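The functionals in this section can be explored numerically by simulating the block-counting process of a Beta$(2-\alpha,\alpha)$ $n$-coalescent. The sketch below is one way to do so, not the method of the cited papers. It tracks only the number of blocks and the number of singletons, uses exchangeability to decide which blocks take part in each merger, and recomputes the merger-size weights at every step, so it is meant for moderate $n$. All names are illustrative.

```python
import random
from math import comb, exp, lgamma, log


def log_beta(x, y):
    """Logarithm of the Euler Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def merger_size_weights(b, alpha):
    """Weights C(b, k) * lambda_{b,k} for k = 2..b, Lambda = Beta(2 - alpha, alpha)."""
    norm = log_beta(2.0 - alpha, alpha)
    return [
        exp(log(comb(b, k)) + log_beta(k - alpha, b - k + alpha) - norm)
        for k in range(2, b + 1)
    ]


def simulate_n_coalescent(n, alpha, rng):
    """Simulate one tree and return a few summary functionals."""
    blocks, singletons = n, n
    leaf_external = True  # is leaf 1 (index 0) still a singleton?
    leaf_branch = external = total = 0.0
    collisions = 0
    while blocks > 1:
        weights = merger_size_weights(blocks, alpha)
        dt = rng.expovariate(sum(weights))  # holding time in this state
        total += blocks * dt
        external += singletons * dt
        if leaf_external:
            leaf_branch += dt
        # Merger size k, then a uniform k-subset of the current blocks.
        k = rng.choices(range(2, blocks + 1), weights=weights)[0]
        merged = rng.sample(range(blocks), k)
        # Indices 0..singletons-1 stand for singleton blocks; index 0 is leaf 1
        # for as long as it remains external.
        if 0 in merged:
            leaf_external = False
        singletons -= sum(1 for i in merged if i < singletons)
        blocks -= k - 1
        collisions += 1
    return {
        "leaf_branch": leaf_branch,
        "external": external,
        "total": total,
        "collisions": collisions,
    }


rng = random.Random(42)
runs = [simulate_n_coalescent(200, 1.5, rng) for _ in range(50)]
print(sum(r["external"] / r["total"] for r in runs) / len(runs))
```

Averaging the external-to-total length ratio over replicates gives a finite-$n$ estimate that can be compared with the limit in probability reported by Dhersin et al. (2012); convergence in $n$ may be slow, so agreement at moderate $n$ should not be expected to be exact.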
3. Poisson-Dirichlet Coalescents and Model Construction
Coalescent processes associated with Poisson–Dirichlet laws arise naturally in two distinct ways:
- As limit partitions at fixed time: For Beta-coalescents in the parameter range treated by Siri-Jégousse et al. (2013), the ranked block frequencies at a fixed time $t$ converge, as $n\to\infty$, to a PD partition.
- From size-biased stick-breaking sampling: The two-parameter PD distributions arise as the limit collision measure in discrete-time $\Xi$-coalescents constructed via sampling points from a normalized Pareto random partition, with possible size-biasing. The precise limiting regime depends on the Pareto index $\alpha$ and the size-biasing parameter $\beta$ (Huillet, 2013), as summarized below:
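| Range of $\alpha$ | Limiting Process | Time Scaling |
|---|---|---|
| $\alpha < 1$ | Discrete-time PD $\Xi$-coalescent | None |
| $1 \le \alpha < 2$ | Beta $\Lambda$-coalescent | Sped up by a factor depending on population size and $\alpha$ |
| $\alpha \ge 2$ | Kingman coalescent | Sped up by a factor depending on population size (two cases) |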
In forward-time Poisson branching-selection models, the genealogical tree is equivalent to these coalescent structures, providing a direct evolutionary mechanism for the emergence of such partitions (Huillet, 2013).
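One backward step of such a sampling construction can be sketched as follows. This is a simplified reading of the model, under stated assumptions: parent weights are $N$ i.i.d. Pareto($\alpha$) draws normalized by their sum, each ancestral lineage picks a parent independently with those weights, lineages sharing a parent merge, and size-biasing is omitted. The function names are hypothetical and the code is not taken from Huillet (2013).

```python
import random
from collections import defaultdict


def pareto_weights(pop_size, alpha, rng):
    """Normalized weights from pop_size i.i.d. Pareto(alpha) draws (alpha > 0)."""
    draws = [rng.paretovariate(alpha) for _ in range(pop_size)]
    total = sum(draws)
    return [d / total for d in draws]


def one_generation_back(blocks, pop_size, alpha, rng):
    """Merge ancestral blocks that choose the same parent.

    blocks: list of lists of sample labels (the current partition).
    Several groups may merge in the same step, as in a Xi-coalescent.
    """
    weights = pareto_weights(pop_size, alpha, rng)
    parents = rng.choices(range(pop_size), weights=weights, k=len(blocks))
    merged = defaultdict(list)
    for block, parent in zip(blocks, parents):
        merged[parent].extend(block)
    return [sorted(b) for b in merged.values()]


rng = random.Random(7)
partition = [[i] for i in range(10)]
for generation in range(3):
    partition = one_generation_back(partition, pop_size=100, alpha=0.7, rng=rng)
    print(generation + 1, partition)
```

With a small Pareto index a few parents carry most of the weight, so large and simultaneous mergers are common in a single step; with a large index the weights are nearly uniform and mergers become rare and mostly binary, which matches the qualitative picture in the table.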
4. Genealogical and Population Genetics Interpretation
In constant-size population models where the offspring distribution is heavy-tailed (e.g., marine species, viral populations), Beta-coalescents and Poisson–Dirichlet coalescents capture the probability of large family sizes and multiple mergers more accurately than the Kingman coalescent. The genealogical interpretation, supported by explicit Markov chain constructions and limit theorems, connects these coalescents to evolving branching populations with selection mechanisms parameterized by offspring fitness and size-biased sampling (Huillet, 2013).
Implications include:
- Neutrality tests (e.g., Fu–Li’s $D$): Statistics like the external/total length ratio are shifted under Beta-coalescent genealogies; the presence of many singletons may indicate skewed offspring variance rather than demographic events (Dhersin et al., 2012).
- Correlation structure: For $1<\alpha<2$, positive correlation between external branch lengths increases variance in mutation counts, affecting inference procedures.
- Limiting distributions: Power-law tails and Gumbel-type extremal behaviors dominate functionals such as minimal clade size and largest block at small times (Siri-Jégousse et al., 2013).
5. Methodologies and Limit Theorems
Analytical results for Beta and PD-coalescents rest on several advanced methodologies:
- Moment recursion: Precise asymptotics for mean and variance of external branch lengths and total lengths via recursive equations and linear expansion methods (Dhersin et al., 2012, Dhersin et al., 2012).
- Renewal approximation and Wasserstein control: For total collisions and tree-length, renewal-type arguments with quantification via Wasserstein distances facilitate transfer of stable limit theorems from approximating random walks to actual coalescence statistics (Gnedin et al., 2012).
- Paint-box and exchangeability: The characterization of block frequencies as draws from PD distributions relies on exchangeability and Kingman’s paint-box construction (Siri-Jégousse et al., 2013).
- Tauberian and coupling techniques: Laplace transform and Tauberian methods are critical for deriving tail behaviors, especially for minimal clade statistics (Siri-Jégousse et al., 2013).
- Forward-in-time population models: Simulation of fitness-dependent Poisson point processes with truncating selection directly yields genealogies corresponding to limiting Beta or PD-coalescents (Huillet, 2013).
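As an illustration of the moment-recursion idea, the expected external branch length of a fixed leaf satisfies a first-step recursion. From $b$ blocks the process waits an exponential time with mean $1/\lambda_b$, a $k$-merger occurs with probability $p_{b,k} = \binom{b}{k}\lambda_{b,k}/\lambda_b$, and by exchangeability the leaf avoids the merger with probability $(b-k)/b$. This gives $E[T^{(b)}] = 1/\lambda_b + \sum_{k=2}^{b} p_{b,k}\,\frac{b-k}{b}\,E[T^{(b-k+1)}]$. The sketch below evaluates this recursion for $\Lambda = \mathrm{Beta}(2-\alpha,\alpha)$; it is a direct numerical evaluation under that assumption, not the asymptotic expansion method of the cited papers, and the names are illustrative.

```python
from math import comb, exp, lgamma, log


def log_beta(x, y):
    """Logarithm of the Euler Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def merger_size_weights(b, alpha):
    """Weights C(b, k) * lambda_{b,k} for k = 2..b, Lambda = Beta(2 - alpha, alpha)."""
    norm = log_beta(2.0 - alpha, alpha)
    return [
        exp(log(comb(b, k)) + log_beta(k - alpha, b - k + alpha) - norm)
        for k in range(2, b + 1)
    ]


def expected_external_branch(n, alpha):
    """Return a list e with e[b] = E[T^(b)] for b = 0..n (e[0] = e[1] = 0)."""
    e = [0.0] * (n + 1)
    for b in range(2, n + 1):
        weights = merger_size_weights(b, alpha)
        rate = sum(weights)
        value = 1.0 / rate  # mean holding time with b blocks
        for k, w in zip(range(2, b + 1), weights):
            # Leaf survives a k-merger with probability (b - k) / b.
            value += (w / rate) * (b - k) / b * e[b - k + 1]
        e[b] = value
    return e


alpha = 1.5
e = expected_external_branch(400, alpha)
for n in (50, 100, 200, 400):
    print(n, n ** (alpha - 1.0) * e[n])
```

The printed values show how $n^{\alpha-1}E[T^{(n)}]$ behaves as $n$ grows, which can be compared with the $n^{\alpha-1}$ scaling of the external branch length reported by Dhersin et al. (2012). The double loop costs $O(n^2)$ operations, which is adequate for $n$ in the hundreds.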
6. Connections to Other Classes and Regimes
Beta and PD-coalescents provide a spectrum of models, interpolating between Kingman’s binary-only regime ($\alpha \to 2$), Bolthausen–Sznitman’s multiple-merger regime ($\alpha = 1$), and intermediate multiple-merger regimes ($1<\alpha<2$). The Poisson–Dirichlet framework includes the two-parameter family $\mathrm{PD}(\alpha,\theta)$, with the additional parameter $\theta$ introducing further flexibility. As $\alpha$ moves across these regimes, tail heuristics and empirical statistics transition sharply, with many functionals exhibiting phase changes in their scaling exponents and correlation structure (Dhersin et al., 2012, Dhersin et al., 2012, Siri-Jégousse et al., 2013).
A plausible implication is that models in the Beta and PD class capture the “universality” class of coalescents arising in a wide range of natural and artificial selection regimes, particularly where reproduction is highly uneven.
7. Summary Table: Key Scaling and Laws
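| Functional | Beta$(2-\alpha,\alpha)$, $1<\alpha<2$ | Law / Limit |
|---|---|---|
| External branch length $n^{\alpha-1}T^{(n)}$ | Converges in law | Power-law tail (Dhersin et al., 2012, Siri-Jégousse et al., 2013) |
| Total external length $L^{(n)}_{\mathrm{ext}}$ | $n^{\alpha-2}L^{(n)}_{\mathrm{ext}}$ converges | $\alpha(\alpha-1)\Gamma(\alpha)$ (Dhersin et al., 2012) |
| Ratio $L^{(n)}_{\mathrm{ext}}/L^{(n)}$ | Converges in probability | $2-\alpha$ (Dhersin et al., 2012) |
| Number of collisions | Rescaled, converges | Beta or stable law (Dhersin et al., 2012, Gnedin et al., 2012) |
| Largest block | Extreme value limit | Gumbel-type (Siri-Jégousse et al., 2013) |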
These results form the mathematical foundation for comprehensive modeling of genetic genealogies under heavy-tailed offspring distributions, large population models, and non-classical evolutionary dynamics.