Kingman Forests on Graphs
- Kingman forests are random spanning forests defined by merging tree roots connected by graph edges, generalizing the classical coalescent process.
- In Erdős–Rényi graphs, the forest structure varies from single spanning trees in dense cases to multiple trees in sparse regimes; the analysis draws on results for recursive trees.
- Coupling with uniform random recursive trees enables precise analysis of tree height, degree profiles, and edge counts, linking combinatorial methods with genealogical models.
Kingman forests are random spanning forests obtained from a generalized Kingman coalescent process defined on a finite graph $G$, as opposed to the classical Kingman coalescent, which corresponds to the complete graph. In this construction, the allowed tree mergers are governed by the edge structure of $G$: two trees with respective roots $u$ and $v$ may merge only if $\{u, v\}$ is an edge of $G$. The process proceeds by repeated allowed mergers until no further mergers are possible, yielding a random spanning forest, termed a "Kingman forest of $G$" (Addario-Berry et al., 19 Sep 2025). This framework enables the study of genealogical and combinatorial properties of coalescent processes on arbitrary graphs, with particular focus on Erdős–Rényi random graphs $G(n,p)$.
1. Kingman Coalescent on General Graphs
The Kingman coalescent on a graph $G$ is defined by restricting coalescence events to pairs of tree roots connected by edges in $G$. Formally, the process starts with each vertex being a singleton tree, and at each step selects an edge $\{u, v\}$ connecting two distinct tree roots and merges the trees rooted at $u$ and $v$ into a single tree. This constraint generalizes the classical Kingman coalescent, which corresponds to $G$ being the complete graph $K_n$. The forest-valued representation extends earlier work on coalescent random forests [Pitman, 1999], and was previously formalized for the complete graph [Addario-Berry & Eslava, 2018].
The end state, reached when no further allowed merges are possible, is a random forest spanning all vertices of $G$. When $G$ is connected and sufficiently dense, the resulting Kingman forest often consists of a single spanning tree; for sparse or structured graphs, the result can be a nontrivial spanning forest with multiple trees.
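The merging procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the source's construction: it assumes that the merging edge is chosen uniformly among edges joining two current roots and that a fair coin decides which endpoint stays a root, by analogy with the classical Kingman coalescent. The function name `kingman_forest` and the parent-array representation are hypothetical choices made for this example.

```python
import random

def kingman_forest(n, edges, rng=None):
    """Run a graph-restricted coalescent on vertices 0..n-1.

    Returns a list `parent` where parent[v] is None if v is a root of the
    final forest, and otherwise the vertex that v's tree was attached to.
    """
    rng = rng or random.Random()
    parent = [None] * n
    # Deduplicate edges and drop self-loops.
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    # Active edges: edges of G whose two endpoints are both still roots.
    active = [(u, v) for u in range(n) for v in adj[u] if u < v]
    while active:
        u, v = rng.choice(active)      # ASSUMPTION: uniform choice of edge
        if rng.random() < 0.5:         # ASSUMPTION: fair coin picks the new root
            u, v = v, u
        parent[v] = u                  # v's tree now hangs below root u
        # v is no longer a root, so every edge touching v becomes inactive.
        active = [(a, b) for (a, b) in active if a != v and b != v]
    return parent

if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3), (3, 4)]
    forest = kingman_forest(5, path, random.Random(1))
    print("parents:", forest)
    print("number of trees:", forest.count(None))
```

Because a merge is only possible along an edge between two roots, the roots of the final forest always form an independent set of the graph. On a complete graph this forces a single tree, while on a path or other sparse graph several roots can survive.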
2. Kingman Forests on Erdős–Rényi Random Graphs
The case $G = G(n,p)$, the Erdős–Rényi random graph, is of particular interest, as the graph's randomness and sparsity fundamentally alter the coalescent outcome. Throughout, $p$ denotes the edge probability and $n$ the number of vertices.
A key analytic step is establishing a connection between the Kingman coalescent on $G(n,p)$ and uniform random recursive trees. Specifically, after relabelling the trees of the Kingman forest with a suitable mapping function, each tree in the forest is distributed as a uniform random recursive tree (Addario-Berry et al., 19 Sep 2025). This coupling makes it possible to import established results on the height, degree profiles, and path lengths of the constituent trees directly from the literature on recursive trees (e.g., [Devroye 1987], [Pittel 1994], [Drmota 2009]).
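For readers unfamiliar with the comparison object, a uniform random recursive tree on $n$ vertices is built by adding vertices one at a time and attaching each new vertex to a uniformly chosen earlier vertex. The sketch below generates such a tree and computes its height; it illustrates only the recursive tree itself, not the relabelling map used in the source, and the function names are chosen for this example.

```python
import random

def random_recursive_tree(n, rng=None):
    """Return parent[] for a uniform random recursive tree on 0..n-1.

    Vertex 0 is the root; vertex v >= 1 attaches to a parent chosen
    uniformly from the earlier vertices {0, ..., v-1}.
    """
    rng = rng or random.Random()
    parent = [None] * n
    for v in range(1, n):
        parent[v] = rng.randrange(v)
    return parent

def tree_height(parent):
    """Maximum depth of a vertex, using the fact that parent[v] < v."""
    depth = [0] * len(parent)
    for v in range(1, len(parent)):
        depth[v] = depth[parent[v]] + 1
    return max(depth)

if __name__ == "__main__":
    tree = random_recursive_tree(1000, random.Random(42))
    print("height of a 1000-vertex recursive tree:", tree_height(tree))
```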
3. Quantitative Results: Number and Structure of Trees
One principal result concerns the number of trees in a Kingman forest of $G(n,p)$. The analysis splits into regimes for $p$:
- For fixed $p$, as $n \to \infty$, the number of trees converges in distribution to an almost surely finite random variable. This describes the limiting behavior of the tree count for dense Erdős–Rényi graphs.
- For the sparse regime with and , converges in probability to (Addario-Berry et al., 19 Sep 2025). This formula quantifies the expected number of trees in the Kingman forest arising from a sparse but sufficiently connected random graph.
The heights and sizes of the trees comprising Kingman forests are further analyzed, with the coupling to uniform random recursive trees allowing precise determination of their asymptotic distributions.
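The dependence of the tree count on the edge probability can be explored empirically. The following Monte Carlo sketch samples an Erdős–Rényi graph, runs the coalescent, and averages the number of surviving roots. It assumes uniform selection among active edges and a fair coin for the surviving root, neither of which is spelled out in the text above, and the helper names are hypothetical. It does not reproduce or verify the limits stated in the source.

```python
import random

def gnp_edges(n, p, rng):
    """Sample the edge list of an Erdős–Rényi graph on vertices 0..n-1."""
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]

def count_trees(n, p, rng):
    """Number of trees in one sampled Kingman forest of G(n, p)."""
    active = gnp_edges(n, p, rng)    # edges whose endpoints are both roots
    roots = n
    while active:
        u, v = rng.choice(active)                 # ASSUMPTION: uniform edge
        loser = v if rng.random() < 0.5 else u    # ASSUMPTION: fair coin
        # The losing endpoint stops being a root; its edges become inactive.
        active = [e for e in active if loser not in e]
        roots -= 1
    return roots

def mean_tree_count(n, p, trials=200, seed=0):
    """Average tree count over independent samples."""
    rng = random.Random(seed)
    return sum(count_trees(n, p, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    for p in (0.05, 0.2, 0.5, 1.0):
        print(f"n=60, p={p}: mean number of trees = {mean_tree_count(60, p):.2f}")
```

This version recomputes the active edge list after every merge, which is simple but costs time proportional to the number of edges per step; it is intended for small experiments only.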
4. Coupling to Random Recursive Trees
Relabelling Kingman forests via this mapping ensures each component tree is distributed as a uniform random recursive tree. The recursive tree literature provides detailed extremal and structural results, such as limiting degree distributions, maximum degree, tree height asymptotics, and profile properties. Concentration inequalities and tail bounds (e.g., from [Chvátal 1979], [Motwani 1995]) are then used to control quantities such as the edge count and tree heights in Kingman forests (Addario-Berry et al., 19 Sep 2025).
By establishing this coupling, structural questions can be recast in terms of well-understood combinatorics:
Object | Distribution under the coupling | Reference |
---|---|---|
Tree height | Limiting law for recursive trees | [Devroye 1987], [Drmota 2009] |
Maximum degree | Poisson-like asymptotics | [Pittel 1994], [Fuchs et al. 2006] |
Path length | Central limit and moment bounds | [Zhang 2015] |
Thus, the uniform recursive tree perspective answers several questions about Kingman forests without requiring a bespoke analysis for each.
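To give a sense of the kind of statement that becomes available, the block below recalls some classical first-order facts about a uniform random recursive tree. They are quoted from the general recursive tree literature rather than from the source, and the notation ($T_n$ for the tree on $n$ vertices, $\mathcal{H}_n$ for the $n$-th harmonic number) is introduced here for illustration.

```latex
% T_n: uniform random recursive tree on n vertices, root at depth 0.
% \mathcal{H}_n = \sum_{k=1}^{n} 1/k: the n-th harmonic number.
\begin{align*}
\frac{\operatorname{height}(T_n)}{\ln n}
  &\xrightarrow{\;\mathbb{P}\;} e, \\
\frac{\operatorname{maxdeg}(T_n)}{\log_2 n}
  &\xrightarrow{\;\mathbb{P}\;} 1, \\
\mathbb{E}\bigl[\text{total path length of } T_n\bigr]
  &= \sum_{k=2}^{n} \mathcal{H}_{k-1}
   = n\,(\mathcal{H}_n - 1) \sim n \ln n .
\end{align*}
```

The last line uses the fact that the $k$-th inserted vertex has expected depth $\mathcal{H}_{k-1}$, since its depth is a sum of independent indicators with success probabilities $1, 1/2, \dots, 1/(k-1)$.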
5. Probabilistic Tools and Concentration Results
The analysis of Kingman forests depends critically on probabilistic concentration inequalities and tail bounds, including Chernoff bounds and hypergeometric tails (see [Chvátal 1979], [Motwani 1995], [Durrett 2019], [Billingsley 2013]). These tools are invoked to establish high-probability concentration of key random variables in the edge-reveal process defining Kingman forests on $G(n,p)$. For example, the number of merged pairs at each step is shown to be highly concentrated around its mean, leading to sharp predictions for the forest structure.
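As a generic illustration of the type of bound involved, the sketch below compares the exact two-sided tail of a binomial random variable with the standard multiplicative Chernoff bound $\mathbb{P}(|X - \mu| \ge \delta\mu) \le 2\exp(-\delta^2 \mu / 3)$, valid for $0 < \delta \le 1$ and $\mu = mq$. The binomial variable stands in for a count of revealed edges; the specific random variables and constants used in the source are not reproduced here, and the function names are chosen for this example.

```python
from math import comb, exp

def binomial_two_sided_tail(m, q, delta):
    """Exact P(|X - mu| >= delta * mu) for X ~ Binomial(m, q), mu = m * q."""
    mu = m * q
    return sum(comb(m, k) * q**k * (1 - q)**(m - k)
               for k in range(m + 1)
               if abs(k - mu) >= delta * mu)

def chernoff_bound(m, q, delta):
    """Multiplicative Chernoff bound 2 * exp(-delta^2 * mu / 3), 0 < delta <= 1."""
    return 2 * exp(-delta**2 * m * q / 3)

if __name__ == "__main__":
    for m in (100, 400, 1600):
        exact = binomial_two_sided_tail(m, 0.3, 0.2)
        bound = chernoff_bound(m, 0.3, 0.2)
        print(f"m={m}: exact tail = {exact:.3e}, Chernoff bound = {bound:.3e}")
```

Both quantities decay exponentially in the mean $\mu$, which is what allows union bounds over many steps of an edge-reveal process.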
6. Connections to Genealogical Models and Applied Domains
Kingman forests generalize classical coalescent models central to population genetics, where the genealogy of sampled individuals is represented by tree-like structures. Potential applications include modeling ancestral recombination graphs (ARGs), where allowed mergers correspond to biologically plausible recombination histories in structured or heterogeneous populations (cf. [Kingman 1982], [Fu 2006], [Kaj & Krone 2003], [Cousins et al. 2025], [Nielsen et al. 2025]). The edge constraint dictated by $G$ may represent geographic, social, or genetic incompatibilities.
A plausible implication is that varying the underlying graph enables modeling new scenarios in population genetics where network constraints or population structure are encoded directly in the coalescent process. The behavior of Kingman forests under different graph models—such as inhomogeneous random graphs [Van der Hofstad 2024], [Bogerd et al. 2020]—is an active research area.
7. Relations to Scaling Limits and Future Research Directions
The study of Kingman forests interfaces with work on scaling limits in random graphs and minimal spanning trees, such as [Aldous 1997], [Addario-Berry et al. 2017], and [Goldschmidt et al. 2017], which analyze global tree structure in sparse regimes. Techniques developed for Kingman forests might carry over to related models, notably additive and multiplicative coalescent processes on general graphs. Open questions include the behavior of Kingman forests on highly inhomogeneous graphs, modifications of the coalescent rules, and connections to Brownian excursion-based scaling limits.
In summary, Kingman forests offer a unifying framework for analyzing genealogical, combinatorial, and probabilistic properties of coalescent-type processes on general graphs, with substantial connections to population genetics, random tree theory, and random graph scaling limits. The coupling to uniform recursive trees and concentration results greatly facilitate their analytic tractability and open further research in structured coalescent processes and their applications (Addario-Berry et al., 19 Sep 2025).