
Variational Foresight Dynamic Selection (VFDS)

Updated 8 February 2026
  • VFDS is a Bayesian framework for time-varying, context-dependent variable selection that uses variational inference for scalable posterior and predictive analysis.
  • It dynamically models predictors' relevance in high-dimensional regression, yielding interpretable sparsity patterns and improved forecasting in economics and finance.
  • The algorithm employs coordinate-ascent variational Bayes with Polya–Gamma augmentation, achieving linear computational complexity and robust handling of abrupt shifts in predictor importance.

Variational Foresight Dynamic Selection (VFDS) is a Bayesian framework for time-varying, context-dependent variable selection in high-dimensional dynamic models. VFDS is designed to infer dynamically evolving predictive structures, such as changes in which input features or predictors are relevant at each time point, while maintaining computational scalability through variational methods. The framework achieves efficient posterior and predictive inference in time-varying parameter (TVP) regression models, offering robust dynamic variable selection and interpretable sparsity patterns foundational for forecasting in economics, finance, and other domains characterized by temporally dependent data (Koop et al., 2018, Bianco et al., 2023).

1. Problem Formulation and Model Structure

VFDS addresses the challenge where predictor relevance changes over time and across contexts, requiring a mechanism to “foresee” which variables are likely to be informative before observing all data. The model is based on a state-space TVP regression:

$$y_t = \sum_{j=1}^{p} \beta_{jt}\, x_{j,t-1} + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma_t^2)$$

with latent coefficients $\beta_{jt}$ that evolve as:

$$\beta_{jt} = \gamma_{jt}\, b_{jt}, \qquad b_{jt} = b_{j,t-1} + \eta_{jt}, \quad \eta_{jt} \sim N(0, \eta_j^2)$$

where $\gamma_{jt} \in \{0,1\}$ is a time-varying binary inclusion indicator controlled by a Bernoulli–Gaussian (“spike-and-slab”) hierarchy:

$$\gamma_{jt} \mid \omega_{jt} \sim \mathrm{Bernoulli}(\mathrm{expit}(\omega_{jt})), \qquad \omega_{j0} \sim N(0, k_0 \xi_j^2), \qquad \omega_{jt} = \omega_{j,t-1} + u_{jt}, \quad u_{jt} \sim N(0, \xi_j^2)$$

The variances and hyperparameters ($\eta_j^2$, $\xi_j^2$, $\nu^2$) may themselves be endowed with conjugate inverse-gamma priors (Bianco et al., 2023). This structure enables both smooth time evolution and abrupt switches in variable importance.

The observation noise variance may evolve with log-stochastic volatility:

$$\sigma_t^2 = \exp(h_t), \qquad h = (h_0, \dots, h_n)^\top \sim N_{n+1}(0, \nu^2 Q^{-1})$$

where $Q$ is the tridiagonal DLM precision matrix encoding temporal regularization.
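The generative structure above can be sketched with a short simulation. This is an illustrative sketch, not the authors' code: it assumes $\beta_{jt} = \gamma_{jt} b_{jt}$ with $b_{jt}$ a Gaussian random walk, and uses a constant noise variance in place of stochastic volatility for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def simulate_vfds(n=200, p=5, xi=0.3, eta=0.1, sigma=0.5):
    """Simulate one path of a dynamic spike-and-slab TVP regression.

    Assumes beta_{jt} = gamma_{jt} * b_{jt}, with b a Gaussian random
    walk (step sd `eta`) and omega a random walk (step sd `xi`) driving
    the Bernoulli inclusion indicators; noise sd `sigma` is constant.
    """
    x = rng.standard_normal((n, p))                        # lagged predictors
    omega = np.cumsum(rng.normal(0, xi, (n, p)), axis=0)   # logit inclusion
    gamma = rng.binomial(1, expit(omega))                  # inclusion flags
    b = np.cumsum(rng.normal(0, eta, (n, p)), axis=0)      # latent slopes
    beta = gamma * b                                       # effective coefs
    y = (beta * x).sum(axis=1) + rng.normal(0, sigma, n)   # observations
    return x, y, beta, gamma


x, y, beta, gamma = simulate_vfds()
print(x.shape, y.shape)
```

Because $\beta_{jt}$ is the product of the indicator and the slope, a predictor's coefficient drops to exactly zero whenever $\gamma_{jt} = 0$, which is what produces the time-localized sparsity patterns discussed below.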

2. Bayesian Foresight and Dynamic Selection

VFDS operationalizes dynamic selection (“foresight”) by modeling the inclusion probabilities

$$\pi_{jt} = P(\gamma_{jt} = 1) = \mathrm{expit}(\omega_{jt})$$

with the latent process $\omega_{jt}$ evolving as a Gaussian Markov random field (GMRF). This captures smooth persistence in the probability of each variable being active, while allowing rapid, context-aware switches (pockets of predictability) driven by the data.

In practice, the raw inclusion probabilities $\pi_{jt}$ can be further regularized by spline-based smoothing that minimizes the KL divergence from the variational posterior (Bianco et al., 2023). This encourages interpretable and stable time-varying sparsity.
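The smoothing idea can be illustrated with a generic spline fit on the logit scale (this is not the KL-minimizing procedure of the paper, just a sketch of post-hoc smoothing of a noisy inclusion-probability path):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)

# Noisy estimated inclusion probabilities pi_{jt} for one predictor:
# a smooth underlying activation ramp observed with jitter.
t = np.arange(200)
truth = 1.0 / (1.0 + np.exp(-(t - 100) / 10.0))   # ramps from ~0 to ~1
pi_hat = np.clip(truth + rng.normal(0, 0.1, t.size), 0.0, 1.0)

# Smooth on the logit scale so the result stays inside (0, 1).
eps = 1e-4
logit = np.log((pi_hat + eps) / (1 - pi_hat + eps))
spline = UnivariateSpline(t, logit, s=float(len(t)))   # s sets smoothness
pi_smooth = 1.0 / (1.0 + np.exp(-spline(t)))

print(pi_smooth[0] < 0.5 < pi_smooth[-1])
```

Working on the logit scale mirrors the model's own parameterization of $\pi_{jt}$ through $\omega_{jt}$, so the smoothed path remains a valid probability trajectory.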

3. Variational Inference and Algorithmic Details

Inference is conducted via coordinate-ascent variational Bayes (VB), using a mean-field factorization:

$$q(\theta) = q(h)\, q(\nu^2) \prod_{j=1}^{p} \left[ q(b_j)\, q(\omega_j)\, q(\eta_j^2)\, q(\xi_j^2) \prod_{t=1}^{n} q(\gamma_{jt})\, q(z_{jt}) \right]$$

where $z_{jt}$ are Polya–Gamma auxiliaries introduced for tractability of the Bernoulli–logit link. The evidence lower bound (ELBO) is maximized:

$$\mathcal{L}[q] = \int q(\theta) \log \frac{p(y, \theta)}{q(\theta)}\, d\theta$$

Key closed-form updates include:

  • $q(b_j)$: multivariate Gaussian, involving inversion of tridiagonal precision matrices in $O(n)$ time per $j$;
  • $q(\gamma_{jt})$: Bernoulli with success probability $\mathrm{expit}(\widetilde\omega_{jt})$, where

$$\widetilde\omega_{jt} = E_q[\omega_{jt}] - \frac{1}{2} E_q[1/\sigma_t^2] \left[ x_{j,t-1}^2\, E_q(b_{jt}^2) - 2\, x_{j,t-1}\, E_q(b_{jt})\, E_q(\epsilon_{-j,t}) \right]$$

  • $q(\omega_j)$: multivariate Gaussian, with blockwise structure;
  • $q(z_{jt})$: Polya–Gamma, facilitating efficient updates of the logit inclusion link;
  • $q(h)$: handled via specialized Newton-type algorithms for the non-conjugate log-linear variance (Bianco et al., 2023).

Early dropping of variables (when $\pi_{jt}$ remains below a threshold) further accelerates computation.
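The $O(n)$ cost of the $q(b_j)$ update rests on banded linear algebra: a random-walk prior induces a tridiagonal precision, so the variational mean solves a tridiagonal system. A generic sketch (a first-difference penalty plus a diagonal likelihood term, not the exact VFDS precision) using SciPy's banded solver:

```python
import numpy as np
from scipy.linalg import solve_banded

# Tridiagonal precision P = random-walk prior + diagonal likelihood term;
# solving P m = r costs O(n) instead of O(n^3) for a dense inversion.
n = 1000
diag = 2.0 * np.ones(n) + 1.0        # prior + likelihood contributions
off = -1.0 * np.ones(n - 1)          # first-difference coupling
r = np.random.default_rng(2).standard_normal(n)

# Banded storage: row 0 = superdiagonal, row 1 = main, row 2 = subdiagonal.
ab = np.zeros((3, n))
ab[0, 1:] = off
ab[1, :] = diag
ab[2, :-1] = off
m = solve_banded((1, 1), ab, r)      # O(n) tridiagonal solve

# Verify against the dense solve.
P = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
print(np.allclose(m, np.linalg.solve(P, r)))  # True
```

The same banded trick applies to the $q(\omega_j)$ and $q(h)$ blocks, which is what keeps the per-iteration cost linear in $n$.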

Algorithmic summary (CAVI framework):

  1. Initialize all means and covariances.
  2. Iterate:
    • For each variable $j$, update $q(b_j)$, $q(\eta_j^2)$, $q(\omega_j)$, $q(\xi_j^2)$.
    • For each time $t$, update $q(z_{jt})$, $q(\gamma_{jt})$.
    • Update $q(h)$, $q(\nu^2)$ (or $q(\sigma^2)$ if the variance is homoskedastic).
  3. Stop when the ELBO has converged to within a chosen tolerance.
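The iterate-until-convergence pattern can be shown on a deliberately simple conjugate model, a normal with unknown mean and precision under a normal-gamma prior; this is a textbook CAVI example, not the VFDS updates themselves:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(2.0, 0.5, size=500)   # data with mean 2, sd 0.5
N, xbar = x.size, x.mean()

# Priors: mu | tau ~ N(mu0, 1/(lam0 * tau)), tau ~ Gamma(a0, b0).
mu0, lam0, a0, b0 = 0.0, 1.0, 1.0, 1.0

E_tau = 1.0                          # initialize the expected precision
for it in range(100):
    # Update q(mu) = N(mu_N, 1/lam_N), holding q(tau) fixed.
    mu_N = (lam0 * mu0 + N * xbar) / (lam0 + N)
    lam_N = (lam0 + N) * E_tau
    # Update q(tau) = Gamma(a_N, b_N), holding q(mu) fixed.
    a_N = a0 + (N + 1) / 2.0
    E_mu2 = mu_N**2 + 1.0 / lam_N    # E_q[mu^2]
    b_N = b0 + 0.5 * (np.sum(x**2) - 2 * mu_N * np.sum(x) + N * E_mu2
                      + lam0 * (E_mu2 - 2 * mu_N * mu0 + mu0**2))
    new_E_tau = a_N / b_N
    if abs(new_E_tau - E_tau) < 1e-10:   # convergence check
        break
    E_tau = new_E_tau

print(round(mu_N, 2), round(1 / np.sqrt(E_tau), 2))
```

Each block update is available in closed form given the others' expectations, exactly the structure the VFDS schedule above exploits, just with many more blocks per iteration.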

Computational complexity is $O(pn)$ per iteration, supporting applications with $p = 100$–$400$ and $n$ up to several hundred (Bianco et al., 2023, Koop et al., 2018).

4. Forecasting, Foresight, and Predictive Distributions

Posterior-based foresight is central, yielding one- or multi-step-ahead predictions that reflect both coefficient uncertainty and the dynamic sparsity pattern:

$$E_q[y_{T+h} \mid y_{1:T}] = \sum_{j=1}^{p} x_{j,T+h-1}\, E_q[\gamma_{j,T+h}]\, E_q[b_{j,T+h}]$$

Predictive variance includes both coefficient and selection uncertainty:

$$\mathrm{Var}_q(y_{T+h} \mid y_{1:T}) \approx x_{T+h-1}^\top \left[ \mathrm{Var}_q(\beta_{T+h}) + E_q[\gamma]\, \mathrm{Var}_q(b) + \cdots \right] x_{T+h-1} + E_q[\sigma_{T+h}^2]$$

Monte Carlo samples can be drawn from the full variational posterior for density forecasting, log-score computation, and uncertainty quantification (Bianco et al., 2023, Koop et al., 2018).
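A minimal sketch of such Monte Carlo forecasting by composition, with made-up variational summaries standing in for a fitted posterior (all numbers below are illustrative assumptions, not estimates from any dataset):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical variational summaries at horizon T+1 for p = 3 predictors:
pi = np.array([0.9, 0.5, 0.05])        # inclusion probabilities E_q[gamma]
b_mean = np.array([1.2, -0.4, 0.0])    # E_q[b_{j,T+1}]
b_sd = np.array([0.2, 0.3, 0.1])       # sqrt(Var_q(b_{j,T+1}))
sigma = 0.5                            # sqrt(E_q[sigma^2_{T+1}])
x_next = np.array([0.8, -1.1, 0.3])    # predictors x_{j,T}

# Draw from the variational predictive: indicators, slopes, then noise.
S = 20000
gamma = rng.binomial(1, pi, size=(S, 3))
b = rng.normal(b_mean, b_sd, size=(S, 3))
y = (gamma * b * x_next).sum(axis=1) + rng.normal(0, sigma, S)

point = y.mean()                        # point forecast
lo, hi = np.quantile(y, [0.05, 0.95])   # 90% predictive interval
print(round(point, 2), round(lo, 2), round(hi, 2))
```

The sampled point forecast matches the analytic expression $\sum_j x_j\, \pi_j\, E_q[b_j]$, while the draws also expose the extra predictive spread contributed by selection uncertainty in $\gamma$.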

5. Empirical Performance and Scalability

Extensive simulation studies demonstrate that VFDS outperforms both static (horseshoe, SSVS, EMVS, etc.) and dynamic (DVS, DSS) Bayesian selection methods on synthetic and real data:

  • Simulation: the posterior overlap of $q^*(\beta_{jt})$ with the MCMC posterior is 80–90% for always-active parameters and roughly 75% for inactive ones, with the lowest mean squared error (MSE) among all competitors and high F1 scores for time-localized inclusion patterns.
  • Macroeconomic forecasting: in FRED-QD and other large macroeconomic datasets ($p \approx 229$), VFDS achieves point and density forecast improvements over unobserved-component and rolling-TVAR models for horizons up to $h = 8$ quarters. The selected predictors exhibit interpretable temporal dynamics, revealing time-varying Phillips-curve and business-cycle information.
  • Financial forecasting: in equity premium prediction ($p = 153$), persistent selection of a small set of variables (e.g., “max-return,” turnover) aligns with known economic mechanisms, with superior log-score and MSE relative to static/rolling competitors.

Time complexity is linear in both dimensions: $O(pn)$ per iteration. Empirically, VFDS converges in a few dozen iterations and is 3–4$\times$ faster than prior dynamic spike-and-slab variational Bayes and over an order of magnitude faster than MCMC (Bianco et al., 2023).

6. Interpretability and Application Domains

The temporal trajectories of the inclusion probabilities $\pi_{jt}$ provide interpretable maps of variable relevance over time, supporting scientific and economic hypothesis testing. In inflation forecasting, VFDS selects predictors such as lagged inflation, industrial production, consumer spending, producer price indices, and unemployment—quantifying their dynamic relationship to the forecast target and revealing underlying structures such as demand-supply shocks and shifting Phillips-curve behavior (Bianco et al., 2023). In financial domains, sparsity persists on select portfolios, providing insights into market frictions.

VFDS has also been adapted to domains with cost-aware selection, such as human activity recognition, where the sequential cost-benefit analysis of feature acquisition is crucial for deploying sensor-based systems (Ardywibowo et al., 2022).

7. Summary Table: Core Elements of VFDS

| Component | Description | Core Reference |
|---|---|---|
| Model structure | TVP regression + dynamic spike-and-slab + stochastic volatility | (Koop et al., 2018) |
| Inference | Coordinate-ascent mean-field variational Bayes with Polya–Gamma augmentation | (Bianco et al., 2023) |
| Dynamic selection mechanism | GMRF-evolving logit $\omega_{jt}$ → time-varying $\pi_{jt}$ | (Bianco et al., 2023) |
| Computational complexity | $O(pn)$ per VB iteration (with drop rules) | (Bianco et al., 2023) |
| Predictive capability | Superior out-of-sample point/density forecasts, interpretable sparsity | (Bianco et al., 2023) |

VFDS is a general, computationally efficient strategy for dynamically and adaptively selecting predictors, suited for high-dimensional settings and supporting interpretable inference on time-varying structures. Its design principles and algorithmic underpinnings position it as a robust foundation for foresight-aware modeling in temporally structured, high-dimensional regression and forecasting applications (Bianco et al., 2023, Koop et al., 2018).
