General Preferential Attachment Trees
- General Preferential Attachment Trees are random trees grown via a sequential process where new nodes attach to existing ones based on a preference function of node degree or attributes.
- They generalize the Barabási–Albert model by allowing rich attachment kernels that produce varying degree distributions, including power-law, stretched-exponential, and condensation regimes.
- Efficient sampling techniques and analytic methods enable simulation studies and statistical estimation, facilitating insights into scaling limits, local weak limits, and phase transitions.
A general preferential attachment (PA) tree is a random tree grown via a sequential stochastic process in which each new node attaches to existing nodes according to probabilities determined by a user-specified preference or attachment function of node degree (or more generally node attributes such as “strength”). This class of models generalizes the canonical Barabási–Albert (BA) tree, allowing rich attachment kernels, inhomogeneity, directionality, multiple edges, and nontrivial seed graphs. The resulting structures exhibit a broad range of asymptotic behaviors, including phase transitions in degree distribution, local weak limits with size-biasing phenomena, and universal scaling limits in various metric topologies (Atwood et al., 2014, Gao et al., 2017, Garavaglia et al., 2022, Yuan et al., 2023).
1. Model Definitions and Variants
A general PA tree is formally specified by:
- Discrete-time process: starting from a “seed” graph (often a finite tree), at each time step $t$, add a new node $v_t$ and connect it via one or more edges to existing nodes.
- Attachment probabilities: Each new edge from $v_t$ chooses a target node $u$ in the current tree with probability proportional to a preference function of the target's degree $d_u(t)$:
$$\mathbb{P}(v_t \to u) = \frac{f(d_u(t))}{\sum_{w} f(d_w(t))}$$
for some nonnegative function $f$ (the “attachment kernel”), where the sum runs over all nodes $w$ currently in the tree. In many models, $f(k) = k + \delta$ for a constant $\delta$ yields linear preferential attachment (with initial attractiveness $\delta$), while $f(k) = k^{\alpha}$ with $0 < \alpha < 1$ (sublinear), $\alpha = 1$ (linear), or $\alpha > 1$ (superlinear) yields regimes with distinct limiting behaviors (Atwood et al., 2014, Gao et al., 2017, Betken et al., 2018).
Extensions:
- Multiple or random (iid) outdegrees per vertex at birth, supporting fixed, Poisson, or arbitrary law for new edges per step—see the RPPT framework (Garavaglia et al., 2022).
- Directed networks and weighted networks, where the preference function may depend on in/out degrees or strengths (Yuan et al., 2023).
- Nontrivial seed graphs and arbitrary initial attribute assignments.
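The basic growth rule of this section (one edge per new node, a kernel that depends on degree only) can be sketched in Python as follows. The function names `grow_pa_tree` and `degrees_from_parents`, the single-edge seed graph, and the parent-array representation are illustrative assumptions, not part of any cited package. The sketch recomputes all weights at every step, so it costs $O(n)$ per insertion and is meant for small experiments only.

```python
import random
from typing import Callable, List, Optional


def grow_pa_tree(n: int, f: Callable[[int], float],
                 seed: Optional[int] = None) -> List[int]:
    """Grow a PA tree on n nodes and return parent[v] (parent[0] == -1).

    Assumptions (hypothetical, for illustration): the seed graph is the
    single edge 0-1, every new node sends exactly one edge, and the
    kernel f depends on the total degree of the target only.
    """
    if n < 2:
        raise ValueError("need at least the two seed nodes")
    rng = random.Random(seed)
    parent = [-1, 0]
    degree = [1, 1]
    for v in range(2, n):
        # Weight of every existing node under the attachment kernel f.
        weights = [f(d) for d in degree]
        # Pick the target with probability f(d_u) / sum_w f(d_w).
        target = rng.choices(range(v), weights=weights, k=1)[0]
        parent.append(target)
        degree[target] += 1
        degree.append(1)
    return parent


def degrees_from_parents(parent: List[int]) -> List[int]:
    """Recover total degrees from the parent array."""
    degree = [0] * len(parent)
    for child, p in enumerate(parent):
        if p >= 0:
            degree[child] += 1
            degree[p] += 1
    return degree
```

For example, `grow_pa_tree(1000, lambda k: k + 0.5, seed=1)` uses a linear kernel with initial attractiveness, while `lambda k: k ** 0.5` and `lambda k: k ** 2` give sublinear and superlinear kernels.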
2. Asymptotic Degree Distributions and Regimes
The choice of preference function is pivotal:
- Linear ($f(k) = k + \delta$): generates power laws in the degree sequence, with exponent $\gamma = 3 + \delta$ (undirected, $m = 1$ edge per step) (Atwood et al., 2014, Gao et al., 2017, Brightwell et al., 2010, Yuan et al., 2023). For the BA parameterization ($\delta = 0$), $\gamma = 3$ and the empirical fraction of nodes with degree $k$ converges to $4/[k(k+1)(k+2)]$, which sums to one over $k \ge 1$.
- Sublinear ($f(k) = k^{\alpha}$, $0 < \alpha < 1$): the degree distribution decays with a stretched-exponential cutoff; there is no scale-free behavior and no “hubs” (Betken et al., 2018, Gao et al., 2017, Atwood et al., 2014).
- Superlinear ($f(k) = k^{\alpha}$, $\alpha > 1$): a “condensation” or “winner-takes-all” phenomenon arises, with one or a few vertices capturing a positive fraction of the edges (“star” regime) (Atwood et al., 2014, Gao et al., 2017).
- General random outdegree: For trees with i.i.d. outdegrees and arbitrary fitness $\delta$, the limiting degree distribution is a Poisson-mixed model whose heavy-tail index depends on the tail exponent of the outdegree distribution (Garavaglia et al., 2022).
The degree distribution can often be characterized through recurrence relations, master equations, or Markovian approaches, yielding explicit formulas or local limit theorems (Brightwell et al., 2010, Betken et al., 2018, Yuan et al., 2023).
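As a worked instance of the master-equation approach, consider $f(k) = k$ with one edge per step. The total weight at time $t$ is the degree sum, roughly $2t$, so writing $N_k(t) \approx t\,p_k$ for the number of degree-$k$ nodes gives the stationary balance $p_k = \tfrac{1}{2}\left[(k-1)p_{k-1} - k\,p_k\right] + \mathbf{1}\{k=1\}$, that is, $p_1 = 2/3$ and $p_k = \frac{k-1}{k+2}\,p_{k-1}$. The short sketch below iterates this recursion in exact arithmetic; the function name `ba_tree_degree_law` is hypothetical, and the heuristic derivation is a standard argument rather than a transcription of any one cited paper.

```python
from fractions import Fraction
from typing import Dict


def ba_tree_degree_law(k_max: int) -> Dict[int, Fraction]:
    """Stationary degree probabilities p_1..p_{k_max} for f(k) = k.

    Iterates p_k = (k - 1) / (k + 2) * p_{k-1}, starting from p_1 = 2/3,
    which is the stationary master equation for one edge per step.
    """
    p = {1: Fraction(2, 3)}
    for k in range(2, k_max + 1):
        p[k] = Fraction(k - 1, k + 2) * p[k - 1]
    return p
```

Iterating the recursion reproduces the closed form $p_k = 4/[k(k+1)(k+2)]$ exactly, and the partial sums approach one as `k_max` grows.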
3. Metric, Scaling, and Continuum Limits
General PA trees are amenable to continuum scaling limits in the Gromov–Hausdorff–Prokhorov topology:
- Under appropriate rescaling, discrete PA trees converge to random measured metric spaces, described via universal “line-breaking” or “block-gluing” constructions. Each node is replaced by a “block” (possibly with more complex internal geometry) and these are glued along the structural skeleton of the random tree, itself determined by the PA rule (Sénizergues, 2020).
- For linear-attachment and certain “plane” embeddings, associated “looptrees” (obtained by replacing each vertex with a cycle of matching degree) converge to the Brownian looptree, a quotient of the Brownian Continuum Random Tree (CRT), whose Hausdorff dimension is $2$ (Curien et al., 2014).
- Weighted, recursive, or split-tree representations facilitate analytic derivation of path lengths, height, and other functional statistics (Janson, 2017, Sénizergues, 2020).
4. Local Weak Limits and Size-Biasing
The random neighborhood around a typical vertex in a large PA tree converges weakly (in the local limit sense) to a multi-type branching process—the random Pólya Point Tree (RPPT); this object encodes a nuanced “size-bias” phenomenon:
- The root of the RPPT has a degree whose law is given explicitly in terms of the age of the root and the underlying process parameters (Garavaglia et al., 2022).
- The degree distribution of an “older neighbor” or “younger child” has a shifted heavy-tail exponent due to size-biasing, reflecting subtle local dependencies induced by the PA dynamic.
This universal local convergence holds for PA models with general i.i.d. outdegree distributions and fitness parameters, and extends to cases in which the degree distribution has infinite variance.
5. Equivalence with Random Split Trees
For linear PA trees ($f(k) = k + \delta$), the global random structure is equivalent, in law, to a random split tree with infinite branching and split vector distributed according to a GEM distribution or, equivalently, the two-parameter Poisson-Dirichlet distribution. The split-tree framework permits transfer of path length, height, and depth profile results from recursive tree theory to PA trees (Janson, 2017).
Structurally, this correspondence enables fixed-point equations for global statistics like the sum over all pairs of nodes of the number of common ancestors, which can be explicitly solved in terms of the split vector parameters.
6. Efficient Generation and Statistical Estimation
Large PA trees with arbitrary attachment kernels can be sampled efficiently by augmenting binary heap or balanced binary tree data structures so that each insertion step supports $O(\log n)$-time weight updates and proportional-to-weight sampling. Implementations (such as the “quicknet” and “wdnet” packages) apply these ideas to very large trees and provide interfaces for weighted, directed, or multiple-edge-per-step models (Yuan et al., 2023, Atwood et al., 2014).
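A minimal sketch of this idea is shown below. It uses a Fenwick (binary indexed) tree in place of the augmented heap described by Atwood et al.; the substitution is an assumption made for brevity, and it gives the same $O(\log n)$ cost per update and per sample. The names `FenwickSampler` and `grow_pa_tree_fast` are hypothetical and do not correspond to the quicknet or wdnet interfaces. The kernel is assumed to be strictly positive.

```python
import random
from typing import Callable, List, Optional


class FenwickSampler:
    """Fenwick tree over nonnegative weights: O(log n) update and sampling."""

    def __init__(self, capacity: int) -> None:
        self.size = capacity
        self.tree = [0.0] * (capacity + 1)  # 1-based partial sums
        self.weights = [0.0] * capacity

    def set_weight(self, i: int, w: float) -> None:
        delta = w - self.weights[i]
        self.weights[i] = w
        j = i + 1
        while j <= self.size:
            self.tree[j] += delta
            j += j & -j

    def total(self) -> float:
        s, j = 0.0, self.size
        while j > 0:
            s += self.tree[j]
            j -= j & -j
        return s

    def sample(self, rng: random.Random) -> int:
        """Return index i with probability weights[i] / total()."""
        total = self.total()
        if total <= 0:
            raise ValueError("all weights are zero")
        while True:
            target = rng.random() * total
            pos, step = 0, 1 << self.size.bit_length()
            # Binary descent: find the largest prefix whose sum is <= target.
            while step > 0:
                nxt = pos + step
                if nxt <= self.size and self.tree[nxt] <= target:
                    target -= self.tree[nxt]
                    pos = nxt
                step >>= 1
            # Retry if floating-point drift landed on a zero-weight slot.
            if pos < self.size and self.weights[pos] > 0:
                return pos


def grow_pa_tree_fast(n: int, f: Callable[[int], float],
                      seed: Optional[int] = None) -> List[int]:
    """Grow a PA tree from the seed edge 0-1; return the parent array."""
    rng = random.Random(seed)
    sampler = FenwickSampler(n)
    degree = [0] * n
    parent = [-1] * n
    degree[0] = degree[1] = 1
    parent[1] = 0
    sampler.set_weight(0, f(1))
    sampler.set_weight(1, f(1))
    for v in range(2, n):
        u = sampler.sample(rng)      # v has weight 0 here, so u < v
        parent[v] = u
        degree[u] += 1
        degree[v] = 1
        sampler.set_weight(u, f(degree[u]))
        sampler.set_weight(v, f(1))
    return parent
```

Because only the chosen target and the new node change weight at each step, two point updates per insertion suffice, and the whole tree is generated in $O(n \log n)$ time for any positive kernel.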
Estimation: Empirical estimators for the unknown preference function $f$ can be formulated in terms of ratios of degree frequencies and cumulative attachment counts, with almost sure consistency and explicit convergence rates established via embeddings in supercritical continuous-time branching processes (Gao et al., 2017). Simulation studies confirm both consistency and the bias-variance trade-offs in estimation as functions of $f$’s growth.
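One estimator of this ratio type can be motivated by a simple balance argument. A node's degree rises above $k$ exactly when a degree-$k$ node receives an edge, so in the stationary regime $p_{>k} = f(k)\,p_k/\lambda$, where $\lambda$ is the limiting total weight per step. The ratio $N_{>k}/N_k$ of observed counts therefore estimates $f(k)$ up to the constant $\lambda$. The sketch below computes this ratio from a degree list. The function name `ratio_estimator` is hypothetical, and treating this as the precise estimator analyzed by Gao et al. is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable


def ratio_estimator(degrees: Iterable[int]) -> Dict[int, float]:
    """Return {k: N_{>k} / N_k} for every degree k observed in the tree.

    N_k is the number of nodes with degree exactly k and N_{>k} the number
    with degree strictly larger; the ratio estimates f(k) up to a constant.
    """
    counts = Counter(degrees)
    n_greater = 0
    estimate = {}
    # Walk degrees from largest to smallest, accumulating the count above k.
    for k in sorted(counts, reverse=True):
        estimate[k] = n_greater / counts[k]
        n_greater += counts[k]
    return estimate
```

For the linear kernel $f(k) = k$, the limiting law $p_k = 4/[k(k+1)(k+2)]$ gives $p_{>k}/p_k = k/2$, so the estimate should grow roughly linearly in $k$ for small degrees, while becoming noisy at large degrees where few nodes are observed.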
7. Extensions: k-Trees, Clustering, and Higher-Dimensional PA Models
The combinatorial extension of PA trees to $k$-trees integrates cliques and higher-order connectivity: in ordered increasing $k$-tree models, each new node connects to all vertices of a chosen $k$-clique, itself selected by an outdegree-weighted preferential rule. This yields tunable power-law exponents ($2+1/k$ for degree) and positive clustering coefficients, interpolating between classical PA trees ($k = 1$) and dense random graphs for large $k$ (Panholzer et al., 2010).
A table summarizing core regimes:
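| Attachment function | Degree distribution type | Limiting exponent (if any) |
|---|---|---|
| $f(k) = k + \delta$ | Power-law | $\gamma = 3 + \delta$ |
| $f(k) = k^{\alpha}$, $0 < \alpha < 1$ | Stretched exponential cutoff | None (no power law) |
| $f(k) = k^{\alpha}$, $\alpha > 1$ | Condensation/star | None (one or a few nodes dominate) |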
References
- (Atwood et al., 2014) "Efficient Network Generation Under General Preferential Attachment"
- (Yuan et al., 2023) "Generating General Preferential Attachment Networks with R Package wdnet"
- (Gao et al., 2017) "Consistent Estimation in General Sublinear Preferential Attachment Trees"
- (Betken et al., 2018) "Fluctuations in a general preferential attachment model via Stein's method"
- (Garavaglia et al., 2022) "Universality of the local limit of preferential attachment models"
- (Brightwell et al., 2010) "Vertices of high degree in the preferential attachment tree"
- (Curien et al., 2014) "Scaling limits and influence of the seed graph in preferential attachment trees"
- (Sénizergues, 2020) "Growing random graphs with a preferential attachment structure"
- (Janson, 2017) "Random recursive trees and preferential attachment trees are random split trees"
- (Panholzer et al., 2010) "Ordered increasing k-trees: Introduction and analysis of a preferential attachment network model"