
Variational & Flow-Based SSM Approximations

Updated 23 February 2026
  • Variational and flow-based SSM approximations are techniques for approximating posteriors in nonlinear, non-Gaussian time-series models.
  • They integrate autoregressive structures and normalizing flows to enable scalable and efficient inference in both continuous and discrete dynamical systems.
  • Innovations like local IAF, marginal particle filtering, and Wasserstein gradient flows enhance computational performance and statistical accuracy.

Variational and flow-based state-space model (SSM) approximations constitute a class of techniques for approximate inference and learning in time-series models parameterized by (potentially nonlinear or non-Gaussian) state evolution and observation mechanisms. These approaches merge variational inference with autoregressive and normalizing flow architectures, particle-based estimators, and continuous-time flow-matching to provide tractable, expressive, and scalable posterior approximations for both continuous and discrete dynamical systems.

1. SSM Posterior Inference and Classical Variational Approaches

Let $X_{t_0:t_N}$ denote a latent Markov process with observations $Y_{t_0:t_N}$ and parameters $\theta$. The joint model is given by

$$X_{t_0} \sim p(x_{t_0}), \quad X_{t_i} \mid X_{t_{i-1}}, \theta \sim p(x_{t_i} \mid x_{t_{i-1}}, \theta), \quad Y_{t_i} \mid X_{t_i}, \theta \sim p(y_{t_i} \mid x_{t_i}, \theta),$$

with joint posterior

$$p(x_{t_0:t_N},\theta \mid y_{t_0:t_N}) \propto p(\theta)\,p(x_{t_0})\prod_{i=1}^N p(x_{t_i} \mid x_{t_{i-1}},\theta)\prod_{i=0}^N p(y_{t_i} \mid x_{t_i},\theta).$$

Classical variational inference introduces an approximating family $q(x_{t_0:t_N}, \theta; \phi)$, optimized via the evidence lower bound (ELBO):

$$\mathcal{L}(\phi) = \mathbb{E}_{q(x,\theta;\phi)}\Bigg[ \log p(\theta) + \log p(x_{t_0}) + \sum_{i=1}^N \log p(x_{t_i} \mid x_{t_{i-1}},\theta) + \sum_{i=0}^N \log p(y_{t_i} \mid x_{t_i},\theta) - \log q(x_{t_0:t_N},\theta;\phi) \Bigg].$$

This variational ELBO forms the backbone for modern flow-based and autoregressive SSM approximations (Ryder et al., 2018).
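The ELBO is usually estimated by Monte Carlo with reparameterized samples from $q$. The following is a minimal sketch under stated assumptions, not the method of any cited paper: the model is a hypothetical 1-D linear-Gaussian SSM $x_i = \theta x_{i-1} + \text{noise}$, $y_i = x_i + \text{noise}$, with standard normal priors on $\theta$ and $x_{t_0}$, and $q$ is a mean-field Gaussian. The function name `elbo_estimate` and all parameter names are illustrative.

```python
import numpy as np

def log_normal(x, mean, std):
    """Elementwise log density of N(mean, std^2)."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def elbo_estimate(y, q_mean, q_std, th_mean, th_std,
                  n_samples=256, seed=0, proc_std=0.5, obs_std=0.3):
    """Monte Carlo ELBO for the assumed model x_i = theta * x_{i-1} + noise,
    y_i = x_i + noise, under a mean-field Gaussian q(x, theta)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Reparameterized draws: sample = mean + std * standard normal noise.
    x = q_mean + q_std * rng.standard_normal((n_samples, n))
    theta = th_mean + th_std * rng.standard_normal(n_samples)
    # The four log-joint terms of the ELBO, in the order they appear in the text.
    log_p = log_normal(theta, 0.0, 1.0)                      # log p(theta)
    log_p += log_normal(x[:, 0], 0.0, 1.0)                   # log p(x_{t_0})
    log_p += log_normal(x[:, 1:], theta[:, None] * x[:, :-1],
                        proc_std).sum(axis=1)                # transitions
    log_p += log_normal(x, y, obs_std).sum(axis=1)           # observations
    # Entropy term: log q evaluated at the same samples.
    log_q = (log_normal(x, q_mean, q_std).sum(axis=1)
             + log_normal(theta, th_mean, th_std))
    return float(np.mean(log_p - log_q))
```

In practice the same estimator would be written in an automatic-differentiation framework so that $\phi$ (here the means and standard deviations of $q$) can be optimized by stochastic gradient ascent; a variational mean placed near the observations scores a markedly higher ELBO than one placed far from them.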

2. Flow-Based and Autoregressive Variational Families

Modern SSM methods leverage normalizing flows and autoregressive architectures to extend variational posteriors beyond simplistic mean-field or linear-Gaussian cases. For continuous latents and parameters, one constructs invertible mappings between noise variables and target latent variables, enabling tractable density evaluation and efficient reparameterization.

Inverse Autoregressive Flow (IAF) in SSMs:

The variational family factorizes as $q(x_{t_0:t_N}, \theta) = q(\theta)\, q(x_{t_0:t_N} \mid \theta)$. For $\theta$, $q(\theta)$ combines $L$ IAF modules with learned permutations. For latent trajectories, a local IAF is introduced:

$$z_{t_i}^0 \sim \mathcal{N}(0,1), \quad z_{t_i}^{j+1} = \mu^{j+1}(\mathcal{C}_{i,j}) \left(1 - \sigma^{j+1}(\mathcal{C}_{i,j})\right) + \sigma^{j+1}(\mathcal{C}_{i,j})\, z_{t_i}^j,$$

where $\mathcal{C}_{i,j}$ denotes the local convolutional receptive field at time index $i$, and $j$ indexes flow layers. After $m$ layers, the final $x_{t_i}$ is a deterministic elementwise transform $h(z_{t_i}^m)$. Each local-IAF layer is invertible with a tractable Jacobian and costs $O(N k d)$ per layer, where $k$ is the receptive field width and $d$ is the latent dimension. For long series ($N \gg kd$) this is orders of magnitude cheaper than full-history IAF/MAF, which incur $O(N^2)$ cost per layer (Ryder et al., 2018).
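The gated update above can be sketched for a scalar latent ($d = 1$). This is an illustration under simplifying assumptions rather than the architecture of Ryder et al.: the receptive field is taken to be only the $k$ preceding values of the previous layer, and fixed linear maps (`w_mu`, `w_sigma`, hypothetical names) stand in for the convolutional network.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def local_context(z, k):
    """Row i holds z[i-k], ..., z[i-1] (zero-padded): a strictly causal window."""
    padded = np.concatenate([np.zeros(k), z])
    return np.stack([padded[i:i + k] for i in range(len(z))])

def local_iaf_layer(z, w_mu, w_sigma, b_sigma=2.0):
    """One gated update z' = mu * (1 - sigma) + sigma * z, computed for all
    time indices at once; cost is O(N k) for a scalar latent."""
    ctx = local_context(z, len(w_mu))            # shape (N, k)
    mu = ctx @ w_mu
    sigma = sigmoid(ctx @ w_sigma + b_sigma)
    z_new = mu * (1.0 - sigma) + sigma * z
    # mu_i and sigma_i depend only on earlier indices, so the Jacobian is
    # lower triangular with diagonal sigma.
    log_det = np.sum(np.log(sigma))
    return z_new, log_det

def local_iaf_inverse(z_new, w_mu, w_sigma, b_sigma=2.0):
    """Invert the layer one time index at a time (sequential, as for any IAF)."""
    k = len(w_mu)
    padded = np.zeros(k + len(z_new))            # first k entries are padding
    for i in range(len(z_new)):
        ctx = padded[i:i + k]                    # already-recovered z[i-k..i-1]
        mu = ctx @ w_mu
        sigma = sigmoid(ctx @ w_sigma + b_sigma)
        padded[i + k] = (z_new[i] - mu * (1.0 - sigma)) / sigma
    return padded[k:]
```

The forward pass is the direction used for sampling and is fully parallel over time, while the log-determinant needed for the ELBO entropy term is a single sum. The sequential inverse is included only to show that the layer is a bijection.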

Extension to Discrete Latents:

Autoregressive variational posteriors for discrete SSMs (e.g., HMMs) are made GPU-efficient via fixed-point iterations. In each of $K$ sweeps, all variables are updated in parallel using the Gumbel-softmax reparameterization, giving $O(K)$ parallel depth as opposed to $O(T)$ sequential steps:

$$z_t^{(k)} = f_\phi\left(z_{1:t-1}^{(k)}, z_{t+1:T}^{(k-1)}, \varepsilon_t\right), \quad \varepsilon_t \sim \mathrm{Gumbel}(0,1).$$

This can be interpreted as a discrete normalizing flow on relaxed variables, with tractable ELBOs obtained either by relaxing discrete samples or by leveraging the change-of-variables formula for the flow dynamics (Aitchison et al., 2018).
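The idea of trading sequential steps for parallel sweeps can be illustrated with a simplified variant. The sketch below makes several assumptions that differ from the description above: it uses a Jacobi-style update in which every position reads only the previous sweep, the conditional for $z_t$ depends on $z_{t-1}$ alone through a hypothetical logit table `W`, and hard Gumbel-max sampling replaces the Gumbel-softmax relaxation.

```python
import numpy as np

def sequential_sample(W, b0, eps):
    """Reference sampler: T sequential Gumbel-max steps."""
    T = eps.shape[0]
    z = np.zeros(T, dtype=int)
    z[0] = np.argmax(b0 + eps[0])
    for t in range(1, T):
        z[t] = np.argmax(W[z[t - 1]] + eps[t])
    return z

def parallel_sweeps(W, b0, eps, n_sweeps):
    """Fixed-point sampler: each sweep updates all T positions at once,
    reusing the same Gumbel noise eps on every sweep."""
    T = eps.shape[0]
    z = np.zeros(T, dtype=int)                   # arbitrary initial guess
    for _ in range(n_sweeps):
        logits = np.empty_like(eps)
        logits[0] = b0
        logits[1:] = W[z[:-1]]                   # reads the previous sweep only
        z = np.argmax(logits + eps, axis=1)
    return z
```

With the noise held fixed, the first $k$ positions agree with the sequential sample after $k$ sweeps, so $K = T$ sweeps reproduce it exactly. The practical benefit comes from cases where far fewer sweeps already give a sample that is close enough.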

3. Particle Methods, Rao-Blackwellization, and Tractable Bounds

Particle-filter-based variational methods build lower bounds on the log marginal likelihood from unbiased particle estimators of the marginal likelihood. The Variational Marginal Particle Filter (VMPF) constructs its variational objective from a Rao–Blackwellized particle estimator:

$$\widehat{p}_{\text{MPF}}(y_{1:T}) = \prod_{t=1}^T \frac{1}{N} \sum_{i=1}^N v_t^i,$$

where, in this section, $N$ is the number of particles and $T$ the number of time steps, and the marginal particle proposals and weights $v_t^i$ are constructed as mixtures over previous-step particles. Because the estimator is unbiased, Jensen's inequality gives the bound

$$L_{\text{VMPF}}(\phi,\theta) = \mathbb{E}\left[\log \widehat{p}_{\text{MPF}}(y_{1:T};\phi,\theta)\right] \le \log p(y_{1:T}).$$

VMPF achieves a provably tighter lower bound than variational SMC (VSMC), owing to the variance reduction from Rao–Blackwellization. Gradient estimators can be fully differentiable and even unbiased when all mixture samples admit a continuous reparameterization (Lai et al., 2021). Particle-filter-based variational SSMs achieve improved empirical test log-likelihoods over VSMC and IWAE, especially on complex and high-dimensional time series (e.g., deep Markov models on music data).
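A marginal particle filter estimate of $\log p(y_{1:T})$ can be sketched as follows. This is an illustration under assumptions, not the VMPF implementation of Lai et al.: the model is a hypothetical 1-D linear-Gaussian SSM with a standard normal initial state, the proposal is the transition kernel with an inflated standard deviation (`prop_std`), and nothing is learned. The $N \times N$ arrays make the quadratic cost in the number of particles explicit.

```python
import numpy as np

def log_normal(x, mean, std):
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def logsumexp(a, axis):
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def mpf_log_likelihood(y, n_particles=200, a=0.9, proc_std=0.5,
                       obs_std=0.3, prop_std=0.7, seed=0):
    """Marginal particle filter estimate of log p(y_{1:T}) for the assumed
    model x_t = a * x_{t-1} + noise, y_t = x_t + noise, x_1 ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    # First step: propose from the prior, so the weight is the likelihood.
    x = rng.standard_normal(n_particles)
    log_v = log_normal(y[0], x, obs_std)
    total = logsumexp(log_v, 0) - np.log(n_particles)
    for y_t in y[1:]:
        log_w = log_v - logsumexp(log_v, 0)      # normalized previous weights
        # Sample from the mixture proposal: pick a component, then move it.
        anc = rng.choice(n_particles, size=n_particles, p=np.exp(log_w))
        x_new = a * x[anc] + prop_std * rng.standard_normal(n_particles)
        # Entry [i, j] relates new particle i to previous particle j.
        log_f = log_normal(x_new[:, None], a * x[None, :], proc_std)
        log_q = log_normal(x_new[:, None], a * x[None, :], prop_std)
        # Weight = likelihood * mixture predictive / mixture proposal.
        log_v = (log_normal(y_t, x_new, obs_std)
                 + logsumexp(log_w + log_f, 1) - logsumexp(log_w + log_q, 1))
        total += logsumexp(log_v, 0) - np.log(n_particles)
        x = x_new
    return float(total)
```

Marginalizing over the ancestor index in both numerator and denominator is what distinguishes these weights from those of a standard particle filter, which condition on the single sampled ancestor. For this linear-Gaussian toy model the estimate can be checked against the exact Kalman-filter log-likelihood.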

Online Variational SMC: OVSMC distributes the optimization of the VSMC surrogate ELBO across time, enabling on-the-fly parameter inference and proposal adaptation with robust convergence guarantees (Mastrototaro et al., 2023).

4. Flow-Matching and Wasserstein Gradient Flows

Recent advances in SSM inference employ gradient flows on the Wasserstein space $\mathcal{P}_2(\mathbb{R}^d)$ to define variational filtering recursions. At each filtering step, the posterior $p(x_k \mid y_{0:k})$ is approximated by minimizing

$$L(q) = \int q(x) \log \frac{q(x)}{p(y_k \mid x)\, p(x \mid y_{0:k-1})}\, dx,$$

which equals $\mathrm{KL}\left(q \,\|\, p(\cdot \mid y_{0:k})\right)$ up to the additive constant $-\log p(y_k \mid y_{0:k-1})$ and therefore has the same minimizer.

Instead of parameterizing $q$ in Euclidean coordinates, one defines a Wasserstein gradient flow:

$$\partial_t q_t(x) + \nabla \cdot \left(q_t(x)\, v_t(x)\right) = 0,$$

where $v_t(x) = -\nabla \varphi_t(x)$ and $\varphi_t = \delta L/\delta q \rvert_{q_t}$ is the first variation of $L$ at $q_t$.

When $q_t$ is restricted to be Gaussian or a mixture of Gaussians, the flow reduces to ODEs for the moments. In the Gaussian case $q_t = \mathcal{N}(m_t, P_t)$, with $Z_t \sim q_t$ and $V(x) = -\log\left[p(y_k \mid x)\, p(x \mid y_{0:k-1})\right]$ the negative log of the unnormalized target, these read

$$\frac{d m_t}{dt} = -\mathbb{E}[\nabla V(Z_t)],\quad \frac{d P_t}{dt} = 2 I - \mathbb{E}[\nabla V(Z_t) \otimes (Z_t-m_t)] - \mathbb{E}[(Z_t-m_t) \otimes \nabla V(Z_t)].$$

This approach, Variational Wasserstein Filtering (VWF), improves fidelity in non-Gaussian and multimodal settings where the EKF fails, achieving accuracy competitive with particle filters at reduced computational cost for moderate state dimensions (Corenflos et al., 2023).
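The Gaussian moment ODEs can be integrated numerically. The sketch below is an illustration under assumptions: it uses forward Euler steps, estimates the expectations by Monte Carlo with a fixed set of standard normal draws, and takes a user-supplied `grad_V` (a hypothetical callable returning $\nabla V$ row-wise for a batch of points). It is not the integrator of the cited work.

```python
import numpy as np

def gaussian_flow(grad_V, m, P, dt=0.05, n_steps=400, n_samples=4000, seed=0):
    """Euler integration of dm/dt = -E[grad V(Z)] and
    dP/dt = 2I - E[grad V (x) (Z - m)] - E[(Z - m) (x) grad V], Z ~ N(m, P)."""
    rng = np.random.default_rng(seed)
    d = len(m)
    eps = rng.standard_normal((n_samples, d))    # reused at every step
    for _ in range(n_steps):
        L = np.linalg.cholesky(P)
        dev = eps @ L.T                          # samples of Z - m
        g = grad_V(m + dev)                      # shape (n_samples, d)
        A = g.T @ dev / n_samples                # estimate of E[grad V (x) (Z - m)]
        m = m - dt * g.mean(axis=0)
        P = P + dt * (2.0 * np.eye(d) - A - A.T)
    return m, P
```

A useful sanity check is a Gaussian target $V(x) = \tfrac{1}{2}(x-\mu)^\top \Sigma^{-1}(x-\mu)$, for which the ODEs have the stationary point $m = \mu$, $P = \Sigma$; the integrated moments approach these values up to Monte Carlo error.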

5. Variational Flow-Matching for Structured and Hybrid SSMs

Generalizing flow-matching techniques, Pawsterior is a variational flow-matching framework specifically designed for simulation-based inference in SSMs with geometric or discrete/hybrid constraints. The method frames posterior transport as conditional dynamics between endpoints $(\theta_0, \theta_1)$ via an affine interpolation $\theta_t = (1-t)\theta_0 + t\theta_1$, and parameterizes the flow ODE using two-sided variational networks estimating $\mu_{0,t}(\theta_t,x)$ and $\mu_{1,t}(\theta_t,x)$:

$$\frac{d\theta}{dt} = -\mu_{0,t}^\varphi(\theta, x) + \mu_{1,t}^\varphi(\theta, x).$$

ELBO-style training is performed over endpoint pairs, with losses decomposed by coordinate (Gaussian for continuous, cross-entropy for discrete).
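The form of this ODE follows directly from the affine interpolation. A short derivation, assuming that $\mu_{0,t}$ and $\mu_{1,t}$ denote the conditional expectations of the two endpoints given the current point and the observation (an interpretation of the notation above, not a quoted definition):

$$\frac{d\theta_t}{dt} = \theta_1 - \theta_0 \quad\Longrightarrow\quad v_t(\theta, x) = \mathbb{E}\left[\theta_1 - \theta_0 \mid \theta_t = \theta, x\right] = \mu_{1,t}(\theta, x) - \mu_{0,t}(\theta, x).$$

Replacing the exact conditional means with their learned approximations $\mu_{0,t}^\varphi$ and $\mu_{1,t}^\varphi$ gives the ODE stated above.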

A principled advantage is affine geometric confinement: the conditional means are parameterized, for example via $\tanh$ or softmax networks, so that sampled flows remain within physically valid domains such as boxes or simplexes. Pawsterior extends flow-matching to posteriors with strictly discrete components (e.g., switching systems), where earlier methods such as FMPE or standard continuous flows fail (Carrasco-Pollo et al., 14 Feb 2026).
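The confinement idea can be sketched with two output maps. This is a minimal illustration, assuming the raw network outputs are unconstrained real vectors; the function names are hypothetical and do not come from the Pawsterior paper.

```python
import numpy as np

def box_mean(raw, low, high):
    """Map unconstrained outputs into the box [low, high] via tanh."""
    return low + 0.5 * (high - low) * (np.tanh(raw) + 1.0)

def simplex_mean(raw):
    """Map unconstrained outputs onto the probability simplex via softmax."""
    shifted = raw - raw.max(axis=-1, keepdims=True)   # for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def interpolate(theta0, theta1, t):
    """Affine interpolation theta_t = (1 - t) * theta0 + t * theta1."""
    return (1.0 - t) * theta0 + t * theta1
```

Because boxes and simplexes are convex, any affine interpolation between two endpoints inside the domain also stays inside it, which is why constraining the predicted endpoint means is a natural way to keep the transport within valid values.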

Performance metrics such as Classifier Two-Sample Test (C2ST) confirm consistently improved posterior fit over previous flow-matching SBI tools, particularly on bounded and hybrid-latent SSMs. Pawsterior’s formulation further facilitates the solution of stiff ODEs in high-dimensional spaces via adaptive solvers and supports further extension to manifold-constrained latent domains.

6. Computational and Practical Considerations

The computational performance and tractability of variational and flow-based SSM approximations depend on model structure, posterior complexity, and targeted accuracy.

  • Local IAFs scale as $O(N k d)$ per layer, enabling tractable inference on long series via convolutional architectures, and are far cheaper than full-history flows with $O(N^2)$ cost per layer (Ryder et al., 2018).
  • Flow-matching and Wasserstein-gradient methods reduce computational cost versus high-particle SMC but require ODE or PDE integration per update, which remains tractable for low-to-moderate dimensions (Corenflos et al., 2023, Carrasco-Pollo et al., 14 Feb 2026).
  • Particle variational bounds benefit from variance reduction by marginalization (Rao–Blackwellization), improving sample efficiency at a cost of $O(N^2)$ per time step for $N$ particles, or $O(N^2 T)$ over a length-$T$ sequence, while offering superior tightness and differentiability (Lai et al., 2021).
  • Autoregressive discrete flows attain $O(K)$ parallel depth, sub-linear in the sequence length $T$ when $K \ll T$, via parallelized fixed-point sweeps, with only mild ELBO degradation compared to fully sequential autoregressive samplers. This is critical for large, discrete dynamical systems (Aitchison et al., 2018).

A summary table of complexity per method is provided:

| Method | Complexity per Update | Domain Applicability |
|---|---|---|
| Local IAF SSM | $O(N k d)$ | Continuous latent SSM |
| Full-history IAF/MAF | $O(N^2)$ | Continuous (smaller $N$) |
| VMPF / Marginal PF | $O(N^2 T)$ | Continuous latent SSM |
| Parallel discrete FP | $O(K T)$, $K \ll T$ | Discrete SSM |
| Wasserstein Flow | ODE integration per step | Continuous/mixture SSM |
| Pawsterior VFM | ODE integration, $O(D)$ | Hybrid/discrete SSM |

7. Empirical Results and Applicability

Empirical evaluations consistently demonstrate that local flow-based variational SSM approximations and particle-marginal approaches provide accurate posterior marginals, parameter estimates, and smoothing trajectories, closely matching ground truth or particle-filter results at substantially reduced runtime. For example:

  • Local IAF outperforms black-box VI and matches forward-filtered marginals in linear and nonlinear SSMs in minutes rather than hours (Ryder et al., 2018).
  • Wasserstein flow filtering tracks multimodal and multiplicative-noise posteriors with accuracy equivalent to high-particle SMC (Corenflos et al., 2023).
  • VMPF provides the tightest ELBO among tractable filtering objectives in deep Markov models and stochastic volatility SSMs (Lai et al., 2021).
  • Discrete flow-based SSM variational posteriors achieve 5–20× sampling speedups over serial autoregressive baselines while maintaining ELBOs within 1–2% of the optimal (Aitchison et al., 2018).
  • Pawsterior uniquely addresses strict bounding and hybrid-discrete SSM support, lowering C2ST versus existing flow-matching SBI baselines (Carrasco-Pollo et al., 14 Feb 2026).

These advances make state-of-the-art variational and flow-based SSM inference feasible for large-scale, nonlinear, and structured time-series data across scientific, engineering, and machine learning applications.
