
Variational Foresight Dynamic Selection (VFDS)

Updated 8 February 2026
  • VFDS is a Bayesian framework for time-varying, context-dependent variable selection that uses variational inference for scalable posterior and predictive analysis.
  • It dynamically models predictors' relevance in high-dimensional regression, yielding interpretable sparsity patterns and improved forecasting in economics and finance.
  • The algorithm employs coordinate-ascent variational Bayes with Polya–Gamma augmentation, achieving linear computational complexity and robust handling of abrupt shifts in predictor importance.

Variational Foresight Dynamic Selection (VFDS) is a Bayesian framework for time-varying, context-dependent variable selection in high-dimensional dynamic models. VFDS is designed to infer dynamically evolving predictive structures, such as changes in which input features or predictors are relevant at each time point, while maintaining computational scalability through variational methods. The framework achieves efficient posterior and predictive inference in time-varying parameter (TVP) regression models, offering robust dynamic variable selection and interpretable sparsity patterns foundational for forecasting in economics, finance, and other domains characterized by temporally dependent data (Koop et al., 2018, Bianco et al., 2023).

1. Problem Formulation and Model Structure

VFDS addresses the challenge where predictor relevance changes over time and across contexts, requiring a mechanism to “foresee” which variables are likely to be informative before observing all data. The model is based on a state-space TVP regression:

$$y_t = \sum_{j=1}^p \beta_{jt} x_{j,t-1} + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma_t^2)$$

with latent coefficients $\beta_{jt}$ that evolve as:

$$\beta_{jt} = \gamma_{jt} b_{jt}, \qquad b_{j0} \sim N(0, k_0\eta_j^2), \qquad b_{jt} = b_{j,t-1} + v_{jt}, \qquad v_{jt}\sim N(0, \eta_j^2)$$

where $\gamma_{jt} \in \{0,1\}$ is a time-varying binary inclusion indicator controlled by a Bernoulli–Gaussian (“spike-and-slab”) hierarchy:

$$\gamma_{jt}\mid\omega_{jt} \sim \mathrm{Bernoulli}(\mathrm{expit}(\omega_{jt})), \qquad \omega_{j0} \sim N(0, k_0\xi_j^2), \qquad \omega_{jt} = \omega_{j,t-1} + u_{jt}, \qquad u_{jt} \sim N(0, \xi_j^2)$$

The variances and hyperparameters ($\eta_j^2, \xi_j^2, \nu^2$) may themselves be endowed with conjugate inverse-gamma priors (Bianco et al., 2023). This structure enables both smooth time evolution and abrupt switches in variable importance.

The observation noise variance may evolve with log-stochastic volatility:

$$\sigma_t^2 = \exp(h_t), \qquad h = (h_0,\dots,h_n)^\top \sim N_{n+1}(0, \nu^2 Q^{-1})$$

where $Q$ is the tridiagonal DLM precision structure encoding temporal regularization.
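The state-space structure above can be sketched as a short forward simulation. The dimensions, hyperparameter values, and the random-walk form used here for the log-volatility prior are illustrative assumptions for the sketch, not settings from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5                                  # illustrative series length and predictor count
k0, eta2, xi2, nu2 = 10.0, 0.01, 0.05, 0.02    # assumed hyperparameter values

x = rng.normal(size=(n + 1, p))                # lagged predictors x_{j,t-1}
b = np.zeros((n + 1, p))                       # latent coefficients b_{jt}
omega = np.zeros((n + 1, p))                   # latent logit process omega_{jt}
b[0] = rng.normal(0, np.sqrt(k0 * eta2), p)
omega[0] = rng.normal(0, np.sqrt(k0 * xi2), p)
for t in range(1, n + 1):                      # Gaussian random walks for b and omega
    b[t] = b[t - 1] + rng.normal(0, np.sqrt(eta2), p)
    omega[t] = omega[t - 1] + rng.normal(0, np.sqrt(xi2), p)

pi = 1.0 / (1.0 + np.exp(-omega))              # expit: time-varying inclusion probabilities
gamma = rng.binomial(1, pi)                    # binary inclusion indicators gamma_{jt}
beta = gamma * b                               # effective coefficients beta_{jt}

# Random-walk log-volatility as a simple stand-in for the N(0, nu^2 Q^{-1}) prior:
h = np.cumsum(rng.normal(0, np.sqrt(nu2), n + 1))
y = (beta[1:] * x[:-1]).sum(axis=1) + rng.normal(0, np.exp(h[1:] / 2))
print(y.shape)  # (200,)
```

Sampling forward like this is also how synthetic benchmarks for dynamic variable selection are typically constructed: the simulated $\gamma_{jt}$ paths give ground-truth inclusion patterns to score recovered sparsity against.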

2. Bayesian Foresight and Dynamic Selection

VFDS operationalizes dynamic selection (“foresight”) by modeling the inclusion probabilities

$$\pi_{jt} = P(\gamma_{jt}=1) = \mathrm{expit}(\omega_{jt})$$

with the latent process $\omega_{jt}$ evolving as a Gaussian Markov random field (GMRF). This captures smooth persistence in the probability of each variable being active, while allowing context-aware, rapid switches (pockets of predictability) driven by the data.

In practice, the raw inclusion probabilities $\pi_{jt}$ can be further regularized by spline-based smoothing, minimizing KL divergence from the variational posterior (Bianco et al., 2023). This approach encourages interpretable and stable time-varying sparsity.
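The idea of post-hoc regularization can be illustrated with a minimal sketch: smooth a noisy inclusion-probability path on the logit scale and map back through the expit. A moving average is used here as a simple stand-in for the spline/KL-based smoother described above; the path, window size, and noise level are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Noisy illustrative raw inclusion-probability path (stand-in for a raw VB estimate).
pi_raw = 1 / (1 + np.exp(-(3 * np.sin(np.arange(n) / 30) + rng.normal(0, 0.8, n))))

# Smooth on the logit scale, where the latent process omega_{jt} actually lives,
# then map back through expit so the result stays a valid probability.
logit = np.log(pi_raw / (1 - pi_raw))
w = 15                                        # assumed smoothing window
kernel = np.ones(w) / w
pad = np.pad(logit, w // 2, mode="edge")      # edge padding keeps output length n
pi_smooth = 1 / (1 + np.exp(-np.convolve(pad, kernel, mode="valid")))
```

Smoothing on the logit scale rather than on $\pi_{jt}$ directly is the natural choice given the model: it respects the $(0,1)$ range automatically and matches the scale on which the GMRF prior operates.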

3. Variational Inference and Algorithmic Details

Inference is conducted via coordinate-ascent variational Bayes (VB), using a mean-field factorization of the form

$$q(b, \gamma, \omega, z, h) = q(h)\prod_{j=1}^p q(b_j)\, q(\gamma_j)\, q(\omega_j)\, q(z_j)$$

where $z_{jt}$ are Polya–Gamma auxiliaries introduced for tractability in the Bernoulli–logit link. The evidence lower bound (ELBO) is optimized:

$$\mathrm{ELBO}(q) = \mathbb{E}_q[\log p(y, b, \gamma, \omega, z, h)] - \mathbb{E}_q[\log q(b, \gamma, \omega, z, h)]$$
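The Polya–Gamma trick pays off because the VB updates never need the PG density itself, only the posterior mean of the auxiliary variable, which has the standard closed form $\mathbb{E}[z] = \tfrac{b}{2c}\tanh(c/2)$ for $z \sim \mathrm{PG}(b, c)$ (Polson, Scott, and Windle's identity). A minimal helper:

```python
import numpy as np

def pg_mean(b, c):
    """Mean of a Polya-Gamma PG(b, c) variable: (b / (2c)) * tanh(c / 2).

    This closed form is all a mean-field VB update needs from q(z); the full
    density is never evaluated. The c -> 0 limit is b / 4.
    """
    c = np.asarray(c, dtype=float)
    small = np.abs(c) < 1e-8
    safe_c = np.where(small, 1.0, c)   # avoid 0/0 in the vectorized branch
    return np.where(small, b / 4.0, (b / (2.0 * safe_c)) * np.tanh(safe_c / 2.0))

# In a logit-link update, c is typically sqrt(E_q[omega_jt^2]) under q(omega).
print(pg_mean(1.0, 0.0))   # 0.25 (limit value)
print(pg_mean(1.0, 2.0))   # 0.25 * tanh(1) ~ 0.1904
```

Plugging $\mathbb{E}[z_{jt}]$ into the quadratic term of the logit likelihood is what turns the non-conjugate Bernoulli–logit factor into a Gaussian-form update for $\omega_j$.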

Key closed-form updates include:

  • $q(b_j)$: multivariate Gaussian, involving inversion of tridiagonal precision matrices in $O(n)$ time per variable $j$;
  • $q(\gamma_{jt})$: Bernoulli, with inclusion probability combining the prior log-odds $\mathbb{E}_q[\omega_{jt}]$ with the expected likelihood gain from including predictor $j$ at time $t$;
  • $q(\omega_j)$: multivariate Gaussian, with blockwise structure;
  • $q(z_{jt})$: Polya–Gamma, facilitating efficient updates to the logit inclusion link;
  • $q(h)$: handled via specialized Newton-type algorithms for the non-conjugate log-linear variance (Bianco et al., 2023).
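The $O(n)$ cost of the Gaussian updates comes from the tridiagonal structure of the random-walk precision: solving a tridiagonal system takes linear time via the Thomas algorithm. A self-contained sketch (the example matrix is illustrative, not taken from the papers):

```python
import numpy as np

def thomas_solve(diag, off, rhs):
    """Solve P x = rhs for symmetric tridiagonal P (main diagonal `diag`,
    first off-diagonal `off`) in O(n) via the Thomas algorithm. This linear
    cost is why each q(b_j) Gaussian update scales with the series length n."""
    n = len(diag)
    c, d = np.empty(n - 1), np.empty(n)
    c[0] = off[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):                      # forward elimination
        m = diag[i] - off[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = off[i] / m
        d[i] = (rhs[i] - off[i - 1] * d[i - 1]) / m
    x = np.empty(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Random-walk-style tridiagonal precision, checked against a dense solve:
n = 6
diag = np.array([2.0] * (n - 1) + [1.0])
off = np.full(n - 1, -1.0)
P = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
rhs = np.arange(1.0, n + 1)
assert np.allclose(thomas_solve(diag, off, rhs), np.linalg.solve(P, rhs))
```

In production code one would call an optimized banded solver (e.g. LAPACK's `dptsv`) rather than a Python loop, but the asymptotics are the same.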

Early dropping of variables (when the estimated inclusion probability $\pi_{jt}$ remains below a threshold) further accelerates computation.

Algorithmic summary (CAVI framework):

  1. Initialize all means and covariances.
  2. Iterate:
    • For each variable $j$, update $q(b_j)$ and $q(\omega_j)$.
    • For each time $t$, update $q(\gamma_{jt})$ and $q(z_{jt})$.
    • Update $q(h)$ (or $q(\sigma^2)$ if the variance is homoskedastic).
  3. Stop on convergence of the ELBO or a chosen tolerance.
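The loop structure above can be sketched as a generic CAVI driver. This shows only the control flow, with toy stand-ins for the VFDS-specific updates (the callables and toy ELBO below are assumptions for the sketch):

```python
import numpy as np

def cavi(update_steps, elbo, tol=1e-6, max_iter=500):
    """Generic CAVI driver: sweep through coordinate updates until the ELBO
    gain falls below tol. `update_steps` is a list of callables that mutate
    the variational state; `elbo` returns the current bound. Because each
    coordinate update maximizes the ELBO over its factor, the bound is
    monotonically non-decreasing, which makes this a valid stopping rule."""
    prev = -np.inf
    for it in range(max_iter):
        for step in update_steps:
            step()
        cur = elbo()
        if cur - prev < tol:
            return it + 1, cur
        prev = cur
    return max_iter, prev

# Toy usage: a single "update" that pulls a scalar toward its optimum 3.0,
# with ELBO = -(m - 3)^2 so the bound increases toward 0 at the fixed point.
state = {"m": 0.0}
steps = [lambda: state.update(m=0.5 * (state["m"] + 3.0))]
iters, bound = cavi(steps, lambda: -(state["m"] - 3.0) ** 2)
print(iters, bound)
```

The monotone-ELBO property is also a practical debugging tool: any decrease between sweeps signals a bug in one of the coordinate updates.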

Computational complexity is linear in both $n$ and $p$ per iteration, supporting applications with long time series and up to several hundred predictors (Bianco et al., 2023, Koop et al., 2018).

4. Forecasting, Foresight, and Predictive Distributions

Posterior-based foresight is central, yielding one- or multi-step-ahead predictions that reflect both coefficient uncertainty and the dynamic sparsity pattern:

$$\hat{y}_{n+1} = \mathbb{E}_q[y_{n+1}] = \sum_{j=1}^p \mathbb{E}_q[\gamma_{j,n+1}]\, \mathbb{E}_q[b_{j,n+1}]\, x_{j,n}$$

Predictive variance includes both coefficient and selection uncertainty:

$$\mathrm{Var}_q(y_{n+1}) = \sum_{j=1}^p \mathrm{Var}_q\big(\gamma_{j,n+1}\, b_{j,n+1}\big)\, x_{j,n}^2 + \mathbb{E}_q\big[\sigma_{n+1}^2\big]$$

Monte Carlo samples can be drawn from the full variational posterior for density forecasting, log-score computation, and uncertainty quantification (Bianco et al., 2023, Koop et al., 2018).
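A minimal sketch of such Monte Carlo predictive sampling, assuming Gaussian variational factors for the coefficients and log-variance and a Bernoulli factor for inclusion (all summary values below are illustrative assumptions, not fitted quantities):

```python
import numpy as np

rng = np.random.default_rng(2)
S, p = 5000, 3                              # number of draws; illustrative dimensions
x_next = rng.normal(size=p)                 # predictors observed at time n

# Illustrative variational posterior summaries for time n+1 (assumed values):
mb, sb = np.array([0.5, 0.0, -0.3]), np.full(p, 0.1)   # q(b): Gaussian mean/sd
pi = np.array([0.9, 0.1, 0.6])                          # q(gamma): Bernoulli prob.
mh, sh = -1.0, 0.2                                      # q(h): Gaussian log-variance

b = rng.normal(mb, sb, size=(S, p))                     # coefficient draws
gamma = rng.binomial(1, pi, size=(S, p))                # inclusion draws
sigma = np.exp(0.5 * rng.normal(mh, sh, size=S))        # volatility draws
y = (gamma * b) @ x_next + sigma * rng.normal(size=S)   # predictive draws

# Point forecast, interval, and density summaries all come from the draws:
print(y.mean(), np.quantile(y, [0.05, 0.95]))
```

Because both $\gamma$ and $b$ are resampled per draw, the resulting spread reflects selection uncertainty as well as coefficient and volatility uncertainty, which is what makes the density forecasts honest about model instability.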

5. Empirical Performance and Scalability

Extensive simulation studies demonstrate that VFDS outperforms both static (horseshoe, SSVS, EMVS, etc.) and dynamic (DVS, DSS) Bayesian selection methods on synthetic and real data:

  • Simulation: Posterior overlap of the variational $\gamma_{jt}$ marginals with the MCMC posterior is 80–90% for always-active parameters and ≈75% for inactive parameters. VFDS attains the lowest mean squared error (MSE) among all competitors and high F1-scores for time-localized inclusion patterns.
  • Macroeconomic forecasting: In FRED-QD and other large macroeconomic datasets, VFDS achieves point and density forecast improvements over unobserved-component and rolling-TVAR models at multi-quarter horizons. Selected predictors exhibit interpretable temporal dynamics, revealing time-varying Phillips-curve and business-cycle information.
  • Financial forecasting: In equity premium prediction, persistent selection of a small set of variables (e.g., “max-return,” turnover) aligns with known economic mechanisms, with superior log-score and MSE relative to static and rolling competitors.

Time complexity is linear in both $n$ and $p$ per iteration. Empirically, VFDS converges in a few dozen iterations, runs several times faster than prior dynamic spike-and-slab variational Bayes, and is over an order of magnitude faster than MCMC (Bianco et al., 2023).

6. Interpretability and Application Domains

The temporal trajectories of the inclusion probabilities $\pi_{jt}$ provide interpretable maps of variable relevance over time, supporting scientific and economic hypothesis testing. In inflation forecasting, VFDS selects predictors such as lagged inflation, industrial production, consumer spending, producer price indices, and unemployment, quantifying their dynamic relationship to the forecast target and revealing underlying structures such as demand-supply shocks and shifting Phillips-curve behavior (Bianco et al., 2023). In financial domains, sparsity persists on select portfolios, providing insights into market frictions.

VFDS has also been adapted to domains with cost-aware selection, such as human activity recognition, where the sequential cost-benefit analysis of feature acquisition is crucial for deploying sensor-based systems (Ardywibowo et al., 2022).

7. Summary Table: Core Elements of VFDS

| Component | Description | Core Reference |
| --- | --- | --- |
| Model structure | TVP regression + dynamic spike-and-slab + stochastic volatility | (Koop et al., 2018) |
| Inference | Coordinate-ascent mean-field variational Bayes with Polya–Gamma augmentation | (Bianco et al., 2023) |
| Dynamic selection mechanism | GMRF-evolving logit $\omega_{jt}$ → time-varying $\pi_{jt}$ | (Bianco et al., 2023) |
| Computational complexity | Linear in $n$ and $p$ per VB iteration (with drop rules) | (Bianco et al., 2023) |
| Predictive capability | Superior out-of-sample point/density forecasts, interpretable sparsity | (Bianco et al., 2023) |

VFDS is a general, computationally efficient strategy for dynamically and adaptively selecting predictors, suited for high-dimensional settings and supporting interpretable inference on time-varying structures. Its design principles and algorithmic underpinnings position it as a robust foundation for foresight-aware modeling in temporally structured, high-dimensional regression and forecasting applications (Bianco et al., 2023, Koop et al., 2018).
