Bivariate Hawkes Process

Updated 28 October 2025

A bivariate Hawkes process is a two-dimensional point process that models self- and cross-excitation to capture temporal clustering in event data.
Its formulation commonly uses exponential kernels and a branching representation, which yield tractable stability conditions and practical parameter estimation.
Applications span finance, neuroscience, and spatio-temporal modeling, offering practical insights for analyzing interacting event sequences.

A bivariate Hawkes process is a two-dimensional point process characterized by mutually exciting intensities, with each component influenced by both its own history (self-excitation) and by the other component (cross-excitation). This framework captures the temporal clustering and feedback mechanisms prevalent in applications such as finance, seismology, neuroscience, and network modeling. The bivariate variant is a foundational example of the broader class of multivariate Hawkes processes, offering tractable yet expressive modeling of interacting event sequences.

1. Mathematical Formulation and Structure

The general bivariate Hawkes process comprises a pair of counting processes, $N_1(t)$ and $N_2(t)$ , with conditional intensities

$\lambda_1^*(t) = \lambda_1 + \sum_{T_k^{(1)} < t} \mu_{1,1}(t - T_k^{(1)}) + \sum_{T_k^{(2)} < t} \mu_{2,1}(t - T_k^{(2)}),$

$\lambda_2^*(t) = \lambda_2 + \sum_{T_k^{(2)} < t} \mu_{2,2}(t - T_k^{(2)}) + \sum_{T_k^{(1)} < t} \mu_{1,2}(t - T_k^{(1)}).$

$\lambda_i$ is the baseline (immigrant) intensity for component $i$ .
$\mu_{j,k}(\cdot)$ is the excitation kernel governing the influence of events in process $j$ on the intensity of process $k$ .
Exponential kernels are commonly used: $\mu_{j,k}(t) = \alpha_{j,k} e^{-\beta_{j,k} t}$ , with $\alpha_{j,k} \ge 0$ , $\beta_{j,k} > 0$ .
The process is stationary if the spectral radius of the matrix $\Phi = (\phi_{j,k})$ , where $\phi_{j,k} = \int_0^\infty \mu_{j,k}(s) ds$ , is less than one (Laub et al., 2015, Laub et al., 17 May 2024).
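
The definitions above can be made concrete with a minimal sketch, assuming exponential kernels throughout. The function names, the nested-list parameter layout (`alpha[j][k]` for source $j$ and target $k$, zero-based), and the numerical values are illustrative assumptions, not taken from the cited works. For exponential kernels $\phi_{j,k} = \alpha_{j,k}/\beta_{j,k}$, and for a 2x2 matrix with non-negative entries the spectral radius has a closed form.

```python
import math

def intensities(t, events1, events2, lam, alpha, beta):
    """Return (lambda_1*(t), lambda_2*(t)) for exponential kernels.

    alpha[j][k], beta[j][k] parameterize the kernel from source j to target k
    (zero-based indices; this layout is an assumption of the sketch).
    """
    events = (events1, events2)
    out = []
    for k in range(2):              # target component
        value = lam[k]              # baseline (immigrant) intensity
        for j in range(2):          # source component
            for s in events[j]:
                if s < t:           # only strictly earlier events contribute
                    value += alpha[j][k] * math.exp(-beta[j][k] * (t - s))
        out.append(value)
    return tuple(out)

def branching_matrix(alpha, beta):
    """phi[j][k] = integral of alpha*exp(-beta*s) over s >= 0 = alpha/beta."""
    return [[alpha[j][k] / beta[j][k] for k in range(2)] for j in range(2)]

def spectral_radius_2x2(m):
    """Spectral radius of a 2x2 matrix with non-negative entries."""
    (a, b), (c, d) = m
    disc = (a - d) ** 2 + 4.0 * b * c   # non-negative when b*c >= 0
    return 0.5 * ((a + d) + math.sqrt(disc))

# Illustrative parameter values (hypothetical).
lam = (0.5, 0.3)
alpha = [[0.6, 0.2], [0.1, 0.5]]
beta = [[1.5, 1.0], [1.0, 2.0]]
phi = branching_matrix(alpha, beta)
print("stationary:", spectral_radius_2x2(phi) < 1.0)
```

The direct double loop costs time proportional to the number of past events per evaluation; for exponential kernels a recursive update is possible but is omitted here for clarity.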

In more general formulations, excitation kernels may be non-exponential (e.g., power-law kernels), multiscale, or even non-monotonic (Batra, 19 Mar 2025). Recent research also covers nonlinear link functions, inhibition (negative-valued kernels), and more complex connectivity structures (Chen et al., 2017).

2. Branching and Immigration-Birth Representation

A key structural insight is the branching representation: each component, $N_i(t)$ , can be viewed as a superposition of an immigrant (background) process and a potentially infinite cascade of offspring generated by past events through specified kernels.

The generalized linear Hawkes model allows each "generation" of offspring to have a distinct kernel $I_n(t)$ :

$\lambda^{(n)}(t) = \int_{(0, t)} I_n(t - s) N^{(n-1)}(ds).$

Specializing to alternating kernels (e.g., $I_n = g$ for odd $n$ , $I_n = h$ for even $n \ge 2$ , with generation 0 consisting of immigrants arriving at rate $\mu$ ) yields two aggregated processes, $N_{\text{even}}$ and $N_{\text{odd}}$ , which together form a bivariate process with mutually exciting structure:

$\lambda_{\text{even}}(t) = \mu + \int_0^t h(t-s) N_{\text{odd}}(ds), \qquad \lambda_{\text{odd}}(t) = \int_0^t g(t-s) N_{\text{even}}(ds).$

(Mehrdad et al., 2014)

This representation clarifies the conditions for stability (total offspring mean less than unity) and underpins limit theorems and deviation principles.
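
As a worked instance of the stability condition, consider the even/odd pair above, writing $\|h\|_1 = \int_0^\infty h(s)\,ds$ and $\|g\|_1 = \int_0^\infty g(s)\,ds$ (shorthand introduced here, not taken from the cited work). Each even-generation event produces on average $\|g\|_1$ odd offspring, and each odd-generation event produces on average $\|h\|_1$ even offspring, so the mean offspring matrix and its spectral radius are

$\Phi_{\text{even/odd}} = \begin{bmatrix} 0 & \|h\|_1 \\ \|g\|_1 & 0 \end{bmatrix}, \qquad \rho(\Phi_{\text{even/odd}}) = \sqrt{\|h\|_1 \|g\|_1}.$

The cascade is therefore subcritical exactly when $\|h\|_1 \|g\|_1 < 1$. In that case, taking expectations in the two intensity equations gives the long-run mean rates $\Lambda_{\text{even}} = \mu + \|h\|_1 \Lambda_{\text{odd}}$ and $\Lambda_{\text{odd}} = \|g\|_1 \Lambda_{\text{even}}$, hence

$\Lambda_{\text{even}} = \frac{\mu}{1 - \|h\|_1 \|g\|_1}, \qquad \Lambda_{\text{odd}} = \frac{\|g\|_1 \, \mu}{1 - \|h\|_1 \|g\|_1}.$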

3. Limit Theorems and Asymptotic Analysis

Stability and long-term properties of the bivariate Hawkes process depend on kernel integrals and background intensity:

Stationarity: The process is stationary if the spectral radius of the mean branching matrix is less than one (row $i$ below collects the kernels feeding component $i$ ; the transposed arrangement has the same spectral radius):

$\rho(\Phi) < 1,\quad \Phi = \begin{bmatrix} \int \mu_{1,1} & \int \mu_{2,1} \\ \int \mu_{1,2} & \int \mu_{2,2} \end{bmatrix} .$

Convergence to Equilibrium: For processes initialized without prior history and satisfying the norm-summability of excitation functions, shifted trajectories converge weakly or in variation to a stationary version (Mehrdad et al., 2014).
Large and Moderate Deviations: Asymptotic expansions for tail probabilities and large deviations have extensions to the bivariate setting, with joint cumulant generating functions governing rate functions and the mod- $\phi$ convergence method underpinning precise expansions (Gao et al., 2017).
Law of Large Numbers and Central Limit Theorem: Once equilibrium is established, standard limit theorems apply to increments, subject to technical integrability conditions.
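
To illustrate what the law of large numbers converges to, the following sketch computes the stationary mean rates $\Lambda_k$, which satisfy $\Lambda_k = \lambda_k + \sum_j \phi_{j,k} \Lambda_j$ in the linear, stable case, so that $N_k(t)/t \to \Lambda_k$. The function name and the `phi[j][k]` layout (source $j$, target $k$, zero-based) are assumptions of this example, and the numbers are purely illustrative.

```python
import math

def stationary_rates(lam, phi):
    """Solve Lambda_k = lam_k + sum_j phi[j][k] * Lambda_j for k = 0, 1.

    phi[j][k] is the integrated kernel from source j to target k and is
    assumed non-negative. Raises ValueError if the spectral radius is >= 1.
    """
    # Closed-form spectral radius for a non-negative 2x2 matrix.
    tr = phi[0][0] + phi[1][1]
    disc = (phi[0][0] - phi[1][1]) ** 2 + 4.0 * phi[0][1] * phi[1][0]
    rho = 0.5 * (tr + math.sqrt(disc))
    if rho >= 1.0:
        raise ValueError("spectral radius >= 1: no stationary mean rates")

    # Linear system [[a, b], [c, d]] @ (L1, L2) = lam, solved by Cramer's rule.
    a, b = 1.0 - phi[0][0], -phi[1][0]
    c, d = -phi[0][1], 1.0 - phi[1][1]
    det = a * d - b * c
    rate1 = (lam[0] * d - b * lam[1]) / det
    rate2 = (a * lam[1] - c * lam[0]) / det
    return rate1, rate2

# Illustrative values (hypothetical): phi = alpha / beta for exponential kernels.
rates = stationary_rates((0.5, 0.3), [[0.4, 0.2], [0.1, 0.25]])
print(rates)
```

Both rates exceed their baselines whenever any excitation is present, which is the amplification produced by the offspring cascade.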

4. Estimation and Inference Methodologies

Parameter estimation in bivariate Hawkes processes employs both parametric and nonparametric approaches:

CLS/VAR(INAR) Discretization: The process is discretized into bin-count sequences and modeled via an integer-valued autoregressive (INAR) process, which can be mapped to a VAR( $p$ ) model. Conditional least squares (CLS) estimates of the parameters are rescaled by the bin width to recover kernel and baseline intensities. This method is shown to be consistent, asymptotically normal, and effective in practical limit order book (LOB) analyses (Kirchner, 2015).
MC-EM for Aggregated Data: For binned or latent event times, a Monte Carlo EM algorithm generates latent proposals for exact event times (superposed process plus allocation) and maximizes a weighted complete-data log-likelihood, subject to the stationarity constraint (Shlomovich et al., 2021).
Bayesian and Nonparametric Methods: Nonparametric Bayesian approaches place priors (for example, Gaussian process priors) on the kernel matrix, often combined with a branching (cluster) structure to facilitate inference. Spike-and-slab or hierarchical priors enable connectivity/graph estimation and promote sparsity (Zhang et al., 2018, Sulem et al., 2021, Sulem et al., 2022).
Spectral Estimation: When the observed process is a superposition of Hawkes and noise (e.g., independent Poisson), parameter estimation can proceed via maximization of the spectral (Whittle) log-likelihood. Identifiability is established for various kernel structures under sufficient cross-excitation (Bonnet et al., 21 May 2024).
Empirical Applications: These methods have been used to uncover asymmetric self- and cross-excitation in LOB data, to estimate excitation structures in terrorism and ecological systems, and to provide robust inference for noisy or partially aggregated datasets (Kirchner, 2015, Zhou et al., 2022, Jun et al., 2022).
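
Likelihood-based variants of the methods above rest on the point-process log-likelihood, $\ell = \sum_{k} \big[ \sum_{n} \log \lambda_k^*(T_n^{(k)}) - \int_0^T \lambda_k^*(s)\,ds \big]$. The following is a minimal sketch of that quantity for fully observed event times, assuming exponential kernels and an observation window $[0, T]$; the function name and the `alpha[j][k]` / `beta[j][k]` layout (source $j$, target $k$, zero-based) are assumptions of this example rather than any cited author's implementation.

```python
import math

def log_likelihood(events1, events2, T, lam, alpha, beta):
    """Log-likelihood of a bivariate exponential-kernel Hawkes process on [0, T].

    Event times are assumed to lie in [0, T]. The evaluation is a direct
    O(n^2) double loop, chosen for readability over speed.
    """
    events = (sorted(events1), sorted(events2))
    total = 0.0
    for k in range(2):                      # target component
        # Compensator: integral of lambda_k*(s) over [0, T].
        comp = lam[k] * T
        for j in range(2):
            for s in events[j]:
                decay = math.exp(-beta[j][k] * (T - s))
                comp += alpha[j][k] / beta[j][k] * (1.0 - decay)
        # Sum of log-intensities at the events of component k.
        log_sum = 0.0
        for t in events[k]:
            value = lam[k]
            for j in range(2):
                for s in events[j]:
                    if s < t:
                        value += alpha[j][k] * math.exp(-beta[j][k] * (t - s))
            log_sum += math.log(value)
        total += log_sum - comp
    return total
```

In practice such a function would be passed (negated) to a numerical optimizer with the constraints $\alpha_{j,k} \ge 0$, $\beta_{j,k} > 0$, and $\rho(\Phi) < 1$; with all $\alpha_{j,k} = 0$ it reduces to the log-likelihood of two independent Poisson processes.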

5. Applications in Finance, Insurance, and Spatio-Temporal Modeling

The bivariate Hawkes process has extensive applications due to its capacity to model mutual excitation and clustering:

Application Area	Process Role	Notable Phenomena Modeled
High-frequency financial markets	Models clustering of trades, order arrivals, and price changes. Buy/sell orders or up/down price moves are mapped to the bivariate process.	Microstructure noise, Epps effect, signed order flow impact (Mehrdad et al., 2014, Laub et al., 2015, Batra, 19 Mar 2025)
Risk/insurance	Aggregate claim arrivals modeled via one or more Hawkes processes (possibly with cross‐excitation), capturing contagion of claim events after catastrophic triggers.	Clustering in insurance losses, ruin probabilities (Mehrdad et al., 2014, Jang et al., 2020)
Spatio-temporal event modeling	Extension to bivariate (or matrix-valued) spatio-temporal models for terrorism/violence, epidemics, or ecological events; cross-triggering captures interplay between groups or processes.	Space-time dispersion, group rivalry, complex contagion (Jun et al., 2022)
Neurophysiology/network science	Two neuron spike trains or information propagation between two regions; mutual and asymmetric connectivity; non-linear and inhibitory kernels relevant.	Granger causality, inhibition, network sparsity (Chen et al., 2017, Sulem et al., 2021)

In financial LOBs, estimated excitation matrices reveal marked asymmetry: market orders can heavily excite limit order flow, while the influence in the reverse direction is negligible (Kirchner, 2015). Spatio-temporal models leverage explicit nonseparable and nonstationary triggering functions to reflect population-driven and location-dependent excitation (e.g., in analyses of Boko Haram and Fulani Extremist attacks in Nigeria) (Jun et al., 2022).

6. Simulation and Algorithmic Implementation

Simulation and inference with bivariate Hawkes processes can be achieved using:

Ogata's Modified Thinning: Candidate event times are proposed from a dominating (upper-bound) intensity and accepted with probability equal to the ratio of the current total intensity to that bound; an accepted event is assigned to a component with probability proportional to that component's intensity. Both intensities update in lockstep as new events in either process occur (Laub et al., 17 May 2024, Laub et al., 2015).
Immigrant–Birth/Branching Construction: Immigrants generate offspring (and descendants) according to prescribed kernels. This supports both analytical reasoning (e.g., via cluster size analysis) and simulation (Mehrdad et al., 2014, Laub et al., 2015).
Spectral/Periodogram-based Methods: For noisy datasets, the periodogram is computed for both auto- and cross-covariances, and the spectral likelihood is maximized over parameter space to estimate kernels, noise level, and base rates (Bonnet et al., 21 May 2024).
Inference in Aggregated Data: When observations are aggregated/binned (or contaminated by indistinguishable Poisson noise), simulation techniques assign latent event times consistent with the observed binned counts, often by simulation under a superposed Hawkes process with event allocation (Shlomovich et al., 2021, Zhou et al., 2022).
Nonparametric and Neural Extensions: Recent approaches employ continuous-time LSTMs or recurrent neural network architectures to generalize beyond additive, exponential kernels; these learn intensity functions directly from data history, enabling more flexible, non-additive dynamics but requiring more advanced training procedures and regularization (Mei et al., 2016).
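
The thinning scheme listed first above can be sketched as follows, assuming exponential kernels with non-negative $\alpha_{j,k}$, so that the total intensity only decays between events and its value just after the current time serves as the upper bound. The function name, parameter layout (`alpha[j][k]` for source $j$, target $k$, zero-based), and seed handling are assumptions of this example, not a reference implementation from the cited papers.

```python
import math
import random

def simulate_thinning(T, lam, alpha, beta, seed=None):
    """Simulate a bivariate exponential-kernel Hawkes process on [0, T).

    Requires lam[0] + lam[1] > 0 and alpha[j][k] >= 0 (assumptions of the
    sketch). Returns two lists of event times, one per component.
    """
    rng = random.Random(seed)
    events = ([], [])

    def intensity(k, at):
        # Includes events with s == at, i.e. the right-limit just after an
        # accepted event; at candidate times no event coincides with `at`.
        value = lam[k]
        for j in range(2):
            for s in events[j]:
                if s <= at:
                    value += alpha[j][k] * math.exp(-beta[j][k] * (at - s))
        return value

    t = 0.0
    while True:
        # Upper bound: total intensity just after t (it decays until the next event).
        bound = intensity(0, t) + intensity(1, t)
        t += rng.expovariate(bound)          # candidate inter-arrival time
        if t >= T:
            break
        rate0, rate1 = intensity(0, t), intensity(1, t)
        u = rng.random() * bound
        if u < rate0:
            events[0].append(t)              # accept as a type-1 event
        elif u < rate0 + rate1:
            events[1].append(t)              # accept as a type-2 event
        # otherwise the candidate is rejected (thinned)
    return events

ev1, ev2 = simulate_thinning(
    T=50.0, lam=(0.5, 0.3),
    alpha=[[0.6, 0.2], [0.1, 0.5]], beta=[[1.5, 1.0], [1.0, 2.0]], seed=1,
)
print(len(ev1), len(ev2))
```

Recomputing the bound after every candidate, including rejected ones, keeps the rejection rate low because the bound follows the decaying intensity downward. The quadratic cost of re-summing all past events is a simplification that a recursive update would avoid.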

7. Theoretical Extensions and Future Directions

Research on bivariate Hawkes processes is at the core of several active threads in the stochastic modeling literature:

Nonlinear and Inhibitory Extensions: Models now admit nonlinear link functions ( $\lambda(t) = \phi(\cdot)$ , $\phi$ monotone nondecreasing) and signed kernels, enabling inhibition and complex feedback (Chen et al., 2017, Sulem et al., 2021).
Precise Deviation Theory: Asymptotic expansions for tail probabilities and rare events now reach beyond exponential decay, with detailed corrections computed via mod- $\phi$ methodology, and extension to the bivariate case is attainable (Gao et al., 2017).
High-dimensional and Sparse Networks: Variational Bayes and spike-and-slab priors promote scalable, sparse inference of interaction graphs, with convergence guarantees for network recovery in high dimensions (Sulem et al., 2022, Sulem et al., 2021).
Quadratic (Nonlinear) Feedback: Multivariate quadratic Hawkes processes (MQHawkes) introduce quadratic feedback (trend-based or second-order kernels), capturing co-jumps and fat-tailed volatility distributions across assets; calibration relies on Yule–Walker-type equations derived for the multivariate case (Aubrun et al., 2022).
Robustness to Measurement Error: Spectral inference techniques address noise, aggregation, and missing data by leveraging second-order properties, and identifiability conditions have now been established for the bivariate exponential scenario (Bonnet et al., 21 May 2024, Zhou et al., 2022).
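
The nonlinear and inhibitory extension in the first item above can be illustrated with a short sketch in which a nondecreasing link is applied to a linear drive built from signed exponential kernels. The two links shown (ReLU and softplus) are common choices assumed here for illustration, and the function names and parameter values are hypothetical; the cited works may use other links.

```python
import math

def relu(x):
    """Nondecreasing link that clips negative drive to zero."""
    return max(x, 0.0)

def softplus(x):
    """Smooth nondecreasing link log(1 + exp(x)), evaluated stably."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def nonlinear_intensities(t, events1, events2, nu, alpha, beta, link=softplus):
    """Return (lambda_1(t), lambda_2(t)) = link(nu_k + signed kernel sums).

    alpha[j][k] may be negative (inhibition from source j on target k);
    beta[j][k] > 0 is the decay rate. Indices are zero-based.
    """
    events = (events1, events2)
    out = []
    for k in range(2):
        drive = nu[k]
        for j in range(2):
            for s in events[j]:
                if s < t:
                    drive += alpha[j][k] * math.exp(-beta[j][k] * (t - s))
        out.append(link(drive))    # the link keeps the intensity non-negative
    return tuple(out)

# Component 1 excites itself and inhibits component 2 (hypothetical values).
nu = (1.0, 0.5)
alpha = [[0.5, -2.0], [0.0, 0.0]]
beta = [[1.0, 1.0], [1.0, 1.0]]
print(nonlinear_intensities(0.1, [0.0], [], nu, alpha, beta, link=relu))
```

With the ReLU link, a sufficiently strong inhibitory kernel silences the target component entirely for a period after each source event, whereas softplus only suppresses it toward zero.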

Plausible directions include further exploration of the interplay between statistical identifiability, kernel parametrization, and network sparsity in high-dimensional and nonstationary event systems; incorporation of covariate effects; and optimal control of self-exciting networks.

In summary, the bivariate Hawkes process constitutes the prototypical multivariate self-exciting point process, exhibiting both rich theoretical structure and breadth of application. Its mathematical formulation via mutually (and self-) exciting intensities, tractable stability and fluctuation theory, robust estimation methods, and practical relevance across finance, neuroscience, seismology, and risk modeling underpin a broad and dynamic research landscape (Mehrdad et al., 2014, Laub et al., 2015, Laub et al., 17 May 2024, Batra, 19 Mar 2025, Jun et al., 2022).