H-SGDLM: Heterogeneous Graphical Dynamic Linear Model
- H-SGDLM is a fully Bayesian multivariate time series model that couples individual dynamic linear models through a dynamically learned, sparse precision matrix.
- It integrates heterogeneous autoregressive components to capture endogenous temporal signals and exogenous cross-series influences, enhancing model interpretability.
- The model supports scalable, GPU-accelerated inference for high-dimensional financial modeling, enabling efficient portfolio optimization and variance decomposition.
The Heterogeneous Simultaneous Graphical Dynamic Linear Model (H-SGDLM) is a fully Bayesian, multivariate time series model which contemporaneously couples a universe of Dynamic Linear Models (DLMs) via a dynamic, sparsely parameterized precision matrix. H-SGDLM extends standard simultaneous graphical DLMs by incorporating heterogeneous autoregressive components, designed to capture both endogenous temporal dependencies (e.g., autoregressive signals) and exogenous cross-series influences (through a dynamically learned, sparse network structure). The approach supports efficient GPU-based inference and compositional variance decomposition, and underpins scalable procedures for high-dimensional, interpretable financial modeling, notably sparse mean-reverting portfolio construction via quasi-convex optimization and cyclical coordinate descent.
1. State-Space Formulation and Model Specification
H-SGDLM observes an $m$-dimensional time series $y_t = (y_{1t}, \dots, y_{mt})^\top$, typically log-prices. For each asset $j = 1, \dots, m$, the model posits a univariate DLM of the form
$$y_{jt} = F_{jt}^\top \theta_{jt} + \nu_{jt}, \qquad \theta_{jt} = G_{jt}\,\theta_{j,t-1} + \omega_{jt},$$
with observation noise $\nu_{jt} \sim N(0, \lambda_{jt}^{-1})$ and state noise $\omega_{jt} \sim N(0, W_{jt})$.
The regressor vector $F_{jt}$ is partitioned into:
- Endogenous terms (lags of returns, moving averages, leverage terms, and derived signals).
- Exogenous terms given by the real-time values of the assets in a dynamically selected parent set $pa(j)_t$.
The state vector $\theta_{jt} = (\phi_{jt}^\top, \gamma_{jt}^\top)^\top$ encapsulates regression coefficients on endogenous ($\phi_{jt}$) and exogenous ($\gamma_{jt}$) predictors, with the observation variance $\lambda_{jt}^{-1}$ conforming to a conjugate Inverse-Gamma prior structure.
To capture cross-asset dependencies, the DLMs are coupled via a sparse, time-varying exogenous coefficient matrix $\Gamma_t$ (with zero diagonal), forming a dynamic, non-symmetric graphical structure. The joint law at time $t$ is then
$$p(y_t \mid \Theta_t, \Lambda_t) = |I_m - \Gamma_t| \prod_{j=1}^{m} N\!\left(y_{jt} \mid F_{jt}^\top \theta_{jt},\, \lambda_{jt}^{-1}\right),$$
with implied precision and covariance
$$\Omega_t = (I_m - \Gamma_t)^\top \Lambda_t (I_m - \Gamma_t), \qquad \Sigma_t = \Omega_t^{-1},$$
where $\Lambda_t = \operatorname{diag}(\lambda_{1t}, \dots, \lambda_{mt})$.
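As a concrete sketch of the coupling, the implied precision and covariance can be assembled directly from $\Gamma_t$ and $\Lambda_t$; all values below are illustrative placeholders, not estimates from any dataset:

```python
import numpy as np

# Illustrative sketch: assembling Omega_t and Sigma_t from a sparse
# coupling matrix Gamma and per-series precisions lambda.
m = 4
rng = np.random.default_rng(0)

Gamma = np.zeros((m, m))           # zero-diagonal exogenous coefficients
Gamma[0, 2] = 0.3                  # asset 0 has parent 2
Gamma[3, 1] = -0.2                 # asset 3 has parent 1

lam = rng.uniform(1.0, 5.0, m)     # observation precisions lambda_{jt}
A = np.eye(m) - Gamma

Omega = A.T @ np.diag(lam) @ A     # implied precision Omega_t
Sigma = np.linalg.inv(Omega)       # implied covariance Sigma_t

# Omega is symmetric positive definite whenever I - Gamma is nonsingular
assert np.allclose(Omega, Omega.T)
assert np.all(np.linalg.eigvalsh(Omega) > 0)
```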
2. Bayesian Filtering, Priors, and Sequential Inference
For each series $j$, H-SGDLM maintains Normal–Gamma conjugate priors $(\theta_{jt}, \lambda_{jt}) \mid \mathcal{D}_{t-1} \sim NG(a_{jt}, R_{jt}, r_{jt}, c_{jt})$. Forecasting and filtering through Kalman-type recursions yield
$$f_{jt} = F_{jt}^\top a_{jt}, \qquad q_{jt} = F_{jt}^\top R_{jt} F_{jt} + c_{jt}, \qquad e_{jt} = y_{jt} - f_{jt}, \qquad A_{jt} = R_{jt} F_{jt} / q_{jt},$$
with posterior state mean $m_{jt} = a_{jt} + A_{jt} e_{jt}$ and all remaining hyperparameter updates following standard Normal–Gamma algebra.
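The recursions above can be sketched in a few lines; this uses one standard West–Harrison Normal–Gamma parameterization, which may differ in notation from the paper's:

```python
import numpy as np

def dlm_update(a, R, r, c, F, y):
    """One conjugate filtering step for a single series: (a, R) prior
    state mean/scale, r degrees of freedom, c variance point estimate.
    A sketch of the standard Normal-Gamma recursions."""
    f = F @ a                              # one-step forecast mean
    q = F @ R @ F + c                      # one-step forecast variance
    e = y - f                              # forecast error
    A = R @ F / q                          # Kalman-type adaptive gain
    r_new = r + 1.0                        # degrees-of-freedom update
    z = (r + e**2 / q) / r_new             # variance rescaling factor
    m_post = a + A * e                     # posterior state mean
    C_post = z * (R - np.outer(A, A) * q)  # posterior state scale
    return m_post, C_post, r_new, c * z, f, q

# Illustrative call with a 2-dimensional state and made-up prior values
m_post, C_post, r1, c1, f, q = dlm_update(
    a=np.zeros(2), R=np.eye(2), r=4.0, c=0.2,
    F=np.array([1.0, 0.5]), y=1.0)
```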
3. Sparse Graphical Structure and Parent Selection
The cross-sectional (graph) structure is built through data-driven selection and shrinkage:
- For each series $j$, an empirical covariance or sparse Wishart filter identifies high-magnitude conditional dependencies, producing candidate parent sets $pa(j)_t$.
- Candidate parents are dynamically promoted to the core set or demoted to the down-set according to their estimated signal-to-noise ratios.
- This yields a sparse adjacency matrix encoding current conditional relationships among assets, which is generally both directed and time-varying.
- Sparsity is induced by limiting parent-set sizes ($|pa(j)_t| \ll m$) and is dynamically adapted at each time step.
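A minimal stand-in for the selection step, ranking candidate parents by absolute empirical correlation (the paper's promotion/demotion uses signal-to-noise ratios; `select_parents` and its arguments are hypothetical names):

```python
import numpy as np

def select_parents(emp_cov, p_max, threshold=0.0):
    """For each series j, keep at most p_max parents whose absolute
    empirical correlation with j exceeds the threshold. A simplified
    stand-in for signal-to-noise-based promotion/demotion."""
    d = np.sqrt(np.diag(emp_cov))
    corr = emp_cov / np.outer(d, d)
    parents = {}
    for j in range(corr.shape[0]):
        strength = np.abs(corr[j])
        strength[j] = 0.0                      # a series is not its own parent
        ranked = np.argsort(strength)[::-1]    # strongest candidates first
        parents[j] = [k for k in ranked[:p_max] if strength[k] > threshold]
    return parents
```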
4. Computational Implementation and Scalability
The model is architected for efficient high-dimensional inference:
- All univariate DLM updates are independent conditional on parent-sets, enabling perfect parallelism—well-matched to GPU tensor operations.
- The only cross-series computation is the "recoupling" step, involving assembly of $\Gamma_t$ and evaluation of the determinant $|I_m - \Gamma_t|$, typically handled with a few hundred Monte Carlo samples.
- GPU implementations with TensorFlow and sparse-pattern storage for $\Gamma_t$ achieve real-time updates for universes of several hundred assets.
- Per-update complexity: parent selection scales with the number of candidate pairs (reduced by thresholding), each individual DLM update is quadratic in its state dimension, and the cost of the $m \times m$ inversion is reduced below cubic by exploiting sparsity.
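The decoupled per-series updates vectorize naturally as batched tensor operations; a NumPy sketch of the batched forecast step (the same einsum pattern maps onto GPU frameworks such as TensorFlow), with all shapes and values synthetic placeholders:

```python
import numpy as np

# Batched forecast/update step for m decoupled series, state dimension p.
m, p = 500, 10
rng = np.random.default_rng(1)

a = rng.normal(size=(m, p))              # prior state means, one row per series
R = np.tile(np.eye(p), (m, 1, 1))        # prior scale matrices
F = rng.normal(size=(m, p))              # regressor vectors
c = np.full(m, 0.1)                      # variance point estimates
y = rng.normal(size=m)                   # observations

f = np.einsum('mp,mp->m', F, a)          # all forecast means at once
RF = np.einsum('mpq,mq->mp', R, F)
q = np.einsum('mp,mp->m', F, RF) + c     # all forecast variances at once
gain = RF / q[:, None]                   # Kalman-type gains, batched
post_mean = a + gain * (y - f)[:, None]  # posterior means, batched
```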
5. Mean-Reverting Portfolio Construction
A primary application is the construction of sparse, mean-reverting portfolios:
- At time $t$, given H-SGDLM estimates (e.g., the one-step predictive covariance $\hat\Sigma_{t+1}$) and the empirical covariance $\Sigma_t^{\mathrm{emp}}$, the quasi-convex problem $\min_x x^\top M x$ subject to $\|x\|_1 = 1$ is solved, where $M$ is interpreted as the "predictability difference" (in the Box–Tiao sense), with sparsity arising from restricting $x$ to nonzero entries corresponding to a graph block of assets.
- The block is defined by the current H-SGDLM parent graph; further sparsity can be imposed by referencing nonzero entries in $\Gamma_t$.
The optimization is efficiently solved via cyclical coordinate descent (CCD):
- For each coordinate $i$, the per-coordinate update is $x_i \leftarrow -\frac{1}{M_{ii}} \sum_{j \neq i} M_{ij} x_j$, with $x$ renormalized after each sweep to enforce $\|x\|_1 = 1$.
- An $\ell_1$-penalty can be incorporated for further sparsification via soft-thresholding.
- CCD exhibits geometric convergence; the objective is differentiable and quasi-convex, with convergence guaranteed by Tseng (2001, Thm 5.1).
CCD Pseudocode (as specified in (Griveau-Billion et al., 2019)):

```
Input: M ∈ R^{n×n} positive-definite, tol > 0
Initialize x^(0) with ||x^(0)||_1 = 1
r = 0
repeat
    for i = 1…n do
        a = Σ_{j≠i} M_ij · x_j^(r)
        x_i^(r+1/2) = −a / M_ii
    end
    x^(r+1) = x^(r+1/2) / ||x^(r+1/2)||_1   # renormalize
    r = r + 1
until ||x^(r) − x^(r−1)||_2 < tol
Output: x*
```
6. Variance Decomposition and Interpretability
H-SGDLM enables granular variance decomposition for each forecast:
- The predictive mean for asset $j$ decomposes into endogenous and exogenous contributions: $f_{jt} = \phi_{jt}^\top F_{jt}^{\mathrm{endo}} + \gamma_{jt}^\top y_t^{pa(j)}$.
- Comparing the summed absolute values of the coefficient vectors $\phi_{jt}$ vs. $\gamma_{jt}$ yields a time series of "endogenous vs. exogenous signal strength," empirically found to be predictive of subsequent variance moves.
- Change-point statistics on these signals facilitate interpretable financial forecasting, including the construction of “directional” portfolios responsive to endogenous/exogenous regime shifts.
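The strength signal reduces to $\ell_1$ norms of the two coefficient blocks; a minimal sketch with hypothetical coefficient vectors (`signal_strengths` is an illustrative name, not the paper's API):

```python
import numpy as np

def signal_strengths(phi, gamma):
    """Relative endogenous vs. exogenous signal strength at one time
    step, from posterior-mean coefficient blocks phi (endogenous) and
    gamma (exogenous). Normalization to fractions is illustrative."""
    endo = np.abs(phi).sum()
    exo = np.abs(gamma).sum()
    total = endo + exo
    return endo / total, exo / total

# e.g., a series currently dominated by its own (endogenous) dynamics
endo, exo = signal_strengths(np.array([0.6, -0.2]), np.array([0.1, 0.1]))
```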
7. Empirical Performance and Applications
Empirical studies benchmark H-SGDLM and its variants on large universes of equities and derivatives:
- On 487 European stocks (2001–2019), the one-step median absolute deviation of realized-variance forecasts is 0.013 for H-SGDLM, comparable to classical HAR-RV (0.012) and outperforming SGDLM (0.015).
- H-SGDLM captures large one-day-ahead moves with 63.89% coverage inside an 18.06%-wide forecast interval, compared to 53.24% (HAR-RV) and 34.69% (SGDLM). For positive jumps, H-SGDLM achieves 63.95% correctness.
- On S&P 500 data, out-of-sample accuracy for large jumps is 65.93% (vs. 54.21% for HAR-RV).
- Variance-decomposition-derived signals, when used for change-point-timed equal-weight portfolios, yield steadily increasing cumulative returns over multi-year periods including stress episodes such as the 2008 crisis.
- Thresholded signals deliver >60% directional forecasting accuracy for the STLFSI Financial Stress Index when applied to S&P 500 subsets.
- GPU acceleration yields real-time inference for up to 500 assets with sparse parent connections and several hundred Monte Carlo recoupling steps per update.
8. Context, Significance, and Research Directions
H-SGDLM provides a unified framework for:
- Multi-horizon volatility modeling (via HAR-RV-style heterogeneous lag structures),
- Dynamic graphical network discovery (via the time-varying sparse $\Gamma_t$),
- Fully Bayesian inference in high dimensions (via Normal–Gamma and Variational Bayes decoupling), and
- Interpretable variance attribution across endogenous/exogenous factors.
The approach overcomes the scalability, flexibility, and interpretability limitations of classical co-integration and factor models, facilitating real-time high-dimensional inference and portfolio construction for financial and econometric applications (Griveau-Billion et al., 2019).
The method's capacity for interpretable decomposition, efficient implementation, and robust empirical performance underpins its adoption for risk management, mean-reverting portfolio discovery, and higher-order market structure analysis. Future extensions may involve alternative network learning strategies, alternative Bayesian shrinkage priors, or applications to non-financial multivariate time series.