Effective Branching Factor in Hawkes Models
- Effective branching factor is a measure that quantifies the average number of offspring events triggered by a single event in Hawkes processes.
- It is estimated using various models—from individual to pooled to hierarchical—to parse out endogenous versus exogenous influences.
- Bayesian hierarchical methods with edge-effect corrections yield more accurate predictions of cascade sizes compared to naïve pooled estimates.
The effective branching factor is a central quantitative descriptor of self-excitation in temporal point processes, particularly Hawkes processes. It encapsulates the expected number of direct offspring events (i.e., triggered by a single antecedent) and governs both the short-term dynamics and the potential for cascading bursts within event sequences. In hierarchical settings, where individual-level heterogeneity must be considered, its precise estimation enables improved inference of endogenous versus exogenous sources of activity and substantially alters interpretations of cascade potential, as demonstrated in the context of aggressive-behavior onsets in clinical populations (Potter et al., 16 Jul 2025).
1. Mathematical Definition of Branching Factor
The effective branching factor, denoted $\mathrm{BF}$ here, is formally defined in the context of the Hawkes point process as the expected number of direct children (offspring) generated by a single event. For a general kernel $\phi(t)$, the branching factor is given by the integral

$$\mathrm{BF} = \int_0^\infty \phi(t)\,dt.$$

For the exponential kernel $\phi(t) = \alpha\beta e^{-\beta t}$ with decay rate $\beta > 0$, this yields

$$\mathrm{BF} = \int_0^\infty \alpha\beta e^{-\beta t}\,dt = \alpha.$$

Thus, under this kernel choice, the branching factor is directly given by the kernel magnitude parameter $\alpha$. At the individual level, $\mathrm{BF}_i = \alpha_i$ for subject $i$. In pooled or population-level models, alternative summaries such as a mean branching factor or a hyperparameter posterior mean are used to aggregate across heterogeneous individuals (Potter et al., 16 Jul 2025).
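As a numerical check on the integral definition, the following minimal Python sketch integrates a normalized exponential kernel with the trapezoidal rule. The kernel form `alpha * beta * exp(-beta * t)`, the function names, and the example values are assumptions chosen for illustration; they are not taken from Potter et al.

```python
import math

def exp_kernel(t, alpha, beta):
    """Assumed normalized exponential kernel: alpha * beta * exp(-beta * t)."""
    return alpha * beta * math.exp(-beta * t)

def branching_factor(alpha, beta, t_max=50.0, steps=100_000):
    """Trapezoidal approximation of the kernel integral over [0, t_max]."""
    h = t_max / steps
    total = 0.5 * (exp_kernel(0.0, alpha, beta) + exp_kernel(t_max, alpha, beta))
    for k in range(1, steps):
        total += exp_kernel(k * h, alpha, beta)
    return total * h

# Illustrative parameters only (not estimates from the paper).
bf = branching_factor(alpha=0.6, beta=2.0)
print(f"numerical branching factor: {bf:.6f}")
```

Under these assumptions the numerical integral agrees with the magnitude parameter up to discretization error, whatever the decay rate.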
2. Model Specification and Levels of Pooling
Potter et al. formalize the estimation hierarchy for the branching factor as follows:
- Individual Model: Each subject $i$ has its own branching factor $\alpha_i$, estimated from that subject's data alone.
- Pooled Model: A global branching factor $\alpha$ is shared across all individuals, yielding a single estimate for the population, but potentially obscuring subject-specific variability.
- Unpooled Summary: Aggregation by averaging the individual posteriors across the $N$ subjects: $\bar{\alpha} = \frac{1}{N}\sum_{i=1}^{N} \alpha_i$.
- Hierarchical Model: A partially pooled specification in which each $\alpha_i$ is drawn from a shared population distribution, whose hyperparameters capture the population-mean branching factor and its variability.
The hierarchical model is further characterized by weakly informative hyperpriors and by priors structured to stabilize inference in the presence of sparsity and substantial between-individual heterogeneity (Potter et al., 16 Jul 2025).
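The unpooled summary can be sketched in a few lines of Python. The sketch assumes that per-subject posterior draws of the branching factor are already available and that averaging is done draw by draw; the function name `unpooled_summary`, the subject labels, and the draw values are hypothetical and do not come from Potter et al.

```python
from statistics import mean, stdev

def unpooled_summary(draws_by_subject):
    """Average the branching factor across subjects, draw by draw.

    Returns the mean and standard deviation of the averaged draws.
    """
    subjects = list(draws_by_subject.values())
    n_draws = min(len(d) for d in subjects)
    averaged = [mean([d[k] for d in subjects]) for k in range(n_draws)]
    return mean(averaged), stdev(averaged)

# Hypothetical posterior draws for two subjects.
draws = {
    "subject_a": [0.2, 0.3, 0.4],
    "subject_b": [0.6, 0.7, 0.8],
}
summary_mean, summary_sd = unpooled_summary(draws)
print(f"unpooled branching factor: {summary_mean:.3f} (sd {summary_sd:.3f})")
```

Because every subject contributes equally regardless of how much data informed its posterior, a summary of this kind inherits the instability of low-data individuals, which is the weakness partial pooling is meant to address.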
3. Edge-Effect Correction in Branching Factor Estimation
Session boundaries and sampling truncation induce an "edge effect": the initial events of each observed segment have no observed potential parents, which biases them toward an exogenous attribution and inflates the baseline rate $\mu$. Potter et al. correct this by adding an initial-intensity term at the start of each observed segment to the conditional intensity, so that excitation carried over from unobserved earlier events is represented explicitly rather than absorbed into the baseline. This edge-effect correction prevents systematic misclassification of early events and reduces upward bias in branching factor estimates for truncated sequences (Potter et al., 16 Jul 2025).
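One way such a correction can be written is sketched below. The exact functional form used by Potter et al. is not reproduced in this text, so the sketch assumes an initial-intensity term `lambda0 * exp(-beta * t)` that decays at the same rate as the kernel, together with an assumed kernel `alpha * beta * exp(-beta * t)`. All names and values are hypothetical.

```python
import math

def intensity(t, history, mu, alpha, beta, lambda0=0.0):
    """Conditional intensity at time t, given event times in `history`.

    ASSUMPTION: the edge-effect term is lambda0 * exp(-beta * t), standing in
    for excitation from events that occurred before observation began.
    """
    carry_over = lambda0 * math.exp(-beta * t)
    triggered = sum(
        alpha * beta * math.exp(-beta * (t - s)) for s in history if s < t
    )
    return mu + carry_over + triggered

# At the start of a session there is no observed history, so without the
# correction the intensity collapses to the baseline rate.
uncorrected = intensity(0.0, [], mu=0.1, alpha=0.5, beta=1.0)
corrected = intensity(0.0, [], mu=0.1, alpha=0.5, beta=1.0, lambda0=0.4)
print(uncorrected, corrected)
```

With a term of this kind, an early event no longer has to be explained by the baseline alone, which is the mechanism by which the correction limits inflation of $\mu$.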
4. Bayesian Inference and Posterior Summarization
Estimation of the effective branching factor within the hierarchical model is conducted via Bayesian inference using the No-U-Turn Sampler (NUTS), implemented in NumPyro. Four independent chains, each with 1,000 warmup and 1,000 sampling iterations, are employed. Diagnostics include:
- Gelman–Rubin statistic ($\hat{R}$)
- Effective sample size for key parameters
- Absence of divergent transitions
- Rank-plots and low Monte Carlo SE
Posterior summaries are reported as mean ± posterior SD, together with highest density intervals (HDIs). The population-level mean of the effective branching factor is represented by a hyperparameter of the hierarchical model (Potter et al., 16 Jul 2025).
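To make the convergence diagnostic concrete, the sketch below computes the classic (non-split) Gelman–Rubin statistic for a set of chains. This is a simplified textbook version, not the rank-normalized split variant that current Bayesian toolkits typically report, and the simulated chains are synthetic stand-ins for real posterior draws.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n   # pooled variance estimate
    return (var_hat / within) ** 0.5

# Synthetic example: four well-mixed chains of 1,000 draws each.
random.seed(0)
chains = [[random.gauss(0.3, 0.05) for _ in range(1000)] for _ in range(4)]
r_hat = gelman_rubin(chains)

# Shifting each chain by a different offset mimics chains stuck in different regions.
shifted = [[x + 0.2 * k for x in c] for k, c in enumerate(chains)]
r_hat_shifted = gelman_rubin(shifted)
print(f"well mixed: {r_hat:.3f}, poorly mixed: {r_hat_shifted:.3f}")
```

Values close to 1 indicate that between-chain and within-chain variability agree, while values well above 1 signal that the chains have not converged to a common distribution.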
5. Empirical Estimates and Reduction in Cascade Size
Numerical results indicate substantial differences between the three pooling strategies: the pooled model, the unpooled (averaged) summary, and the hierarchical model.
The hierarchical estimate is significantly lower and markedly more precise than the pooled estimate (Welch $t$-test). Because the expected number of total descendants of an event in the subcritical regime is determined by the branching factor, the hierarchical model produces a smaller expected cascade size than the pooled model: a threefold reduction in predicted escalation per parent event (Potter et al., 16 Jul 2025).
| Model Type | Branching Factor (mean ± sd) | Expected Cascade Size |
|---|---|---|
| Pooled | ||
| Hierarchical | ||
| Unpooled | | Not explicitly stated |
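The link between branching factor and cascade size follows from a geometric series: an event with branching factor $\alpha < 1$ has on average $\alpha$ children, $\alpha^2$ grandchildren, and so on, giving $\alpha/(1-\alpha)$ expected descendants, or $1/(1-\alpha)$ events when the parent itself is counted. The sketch below evaluates both quantities. The two branching factors are illustrative values only and are not the estimates reported by Potter et al.

```python
def expected_descendants(alpha):
    """Expected number of descendants of one event: alpha / (1 - alpha)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("subcritical regime requires 0 <= alpha < 1")
    return alpha / (1.0 - alpha)

def expected_cascade_size(alpha):
    """Expected cascade size counting the parent event: 1 / (1 - alpha)."""
    return 1.0 + expected_descendants(alpha)

# Illustrative branching factors only (hypothetical, not from the paper).
high, low = 0.8, 0.5
size_high = expected_cascade_size(high)
size_low = expected_cascade_size(low)
print(f"alpha={high}: cascade size {size_high:.2f}")
print(f"alpha={low}: cascade size {size_low:.2f}")
```

Because the denominator shrinks as the branching factor approaches 1, modest differences in the estimate translate into large differences in predicted cascade size, which is why the choice of pooling strategy matters for downstream predictions.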
6. Sensitivity, Robustness, and Goodness-of-Fit
Robustness of the effective branching factor estimation is evaluated by:
- Power-scaling sensitivity analysis: The prior and the likelihood are each raised to a power, and the resulting shift in the posterior is measured with a symmetrized distance metric. The unpooled model is highly sensitive for low-data individuals, while the hierarchical model remains stable.
- Goodness-of-fit (GOF) assessments:
  - Random Time Change Theorem (RTCT) residuals: event times are transformed so that, under an adequate model, the inter-event times are exponentially distributed with unit rate.
  - Lewis test with Durbin's modification: the null hypothesis that the transformed residuals follow a unit-rate Poisson process is retained in the reported proportion of sessions for the partially pooled model.
  - PSIS-LOO (leave-one-out) cross-validation: the hierarchical model achieves the highest expected log pointwise predictive density (ELPD), outperforming both the pooled and unpooled models in out-of-sample prediction accuracy.
These findings establish that the hierarchical approach, in conjunction with edge-effect correction, produces estimates of the branching factor that are robust to sampling variability and model perturbation, and are preferred according to multiple GOF metrics (Potter et al., 16 Jul 2025).
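The RTCT residuals used in these checks can be sketched as follows. The code assumes an exponential kernel `alpha * beta * exp(-beta * t)` with a constant baseline `mu` and no edge-effect term, so it illustrates the general technique rather than the exact model fitted by Potter et al. Function names and event times are hypothetical.

```python
import math

def compensator(t, events, mu, alpha, beta):
    """Integrated conditional intensity over [0, t] for the assumed kernel."""
    return mu * t + sum(
        alpha * (1.0 - math.exp(-beta * (t - s))) for s in events if s < t
    )

def rtct_residuals(events, mu, alpha, beta):
    """Differences of the compensator at successive event times.

    Under a correctly specified model these behave like Exp(1) draws.
    """
    values = [0.0] + [compensator(t, events, mu, alpha, beta) for t in events]
    return [b - a for a, b in zip(values, values[1:])]

# Hypothetical event times. With alpha = 0 the model is a Poisson process and
# the residuals reduce to mu times the raw inter-event gaps.
events = [1.0, 2.5, 4.0]
poisson_residuals = rtct_residuals(events, mu=2.0, alpha=0.0, beta=1.0)
hawkes_residuals = rtct_residuals(events, mu=0.5, alpha=0.5, beta=1.0)
print(poisson_residuals, hawkes_residuals)
```

The residuals would then be compared against a unit-rate exponential distribution, for example with the Lewis test mentioned above, which this sketch does not implement.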
7. Interpretative Significance and Research Applications
The effective branching factor, by encoding the self-excitation strength, directly influences the expected cascade size and thus the predicted scale of secondary event proliferation. By enabling sharper separation of endogenous (dynamically triggered) from exogenous (spontaneous or externally cued) events, rigorous estimation informs both mechanistic interpretation and operational interventions, such as early warning, individualized risk modeling, and resource allocation in clinical and behavioral monitoring contexts. A plausible implication is that inflated cascade predictions resulting from naïve pooling may be systematically corrected by hierarchical modeling, preventing over-allocation of resources and improving targeting of interventions (Potter et al., 16 Jul 2025).