Continuous-Time Evidence Lower Bound
- Continuous-Time ELBO is a variational lower bound for continuous-time SDE models that integrates data likelihoods with stochastic optimal control for irregular time series.
- It employs Doob’s h-transform and neural control parameterizations to derive tractable approximations via a stochastic optimal control framework.
- The approach enables efficient learning and simulation-free inference using piecewise linear drift approximations and modern network architectures.
The continuous-time Evidence Lower Bound (ELBO) is a variational lower bound formulated for probabilistic models that evolve according to continuous-time latent state-space dynamics, notably those driven by stochastic differential equations (SDEs). It provides a foundation for scalable inference and learning in irregularly observed time series, enabling the integration of data likelihoods and pathwise regularization. The continuous-time ELBO considered here arises from a stochastic optimal control (SOC) perspective, establishing a rigorous connection between Doob’s $h$-transform, Feynman–Kac path measures, and amortized variational inference with neural control parameterizations (Park et al., 2024).
1. Feynman–Kac Path Measures and the Posterior in State-space SDEs
Let $X_t$ denote an $\mathbb{R}^d$-valued diffusion evolving under an SDE
$$dX_t = b(X_t, t)\,dt + \sigma\,dW_t, \qquad X_0 \sim p_0,$$
where $b$ is the drift and $W_t$ a standard Brownian motion. Observations $y_1, \ldots, y_N$ are made at irregular time-stamps $0 \le t_1 < \cdots < t_N \le T$, each with likelihood $g_i(y_i \mid X_{t_i})$. The joint path-observation posterior, or the Feynman–Kac model, is
$$d\mathbb{P}^*\big(X_{[0,T]}\big) = \frac{1}{\mathcal{Z}} \prod_{i=1}^{N} g_i\big(y_i \mid X_{t_i}\big)\, d\mathbb{P}\big(X_{[0,T]}\big),$$
with $\mathcal{Z} = p(y_{1:N})$ the marginal likelihood. Normalized potentials are defined as $G_i(x) = g_i(y_i \mid x)/Z_i$, $i = 1, \ldots, N$, with the property $\prod_{i=1}^{N} Z_i = \mathcal{Z}$.
A multi-marginal Doob's $h$-transform yields the posterior dynamics as
$$dX_t = \Big[b(X_t, t) + \sigma\sigma^{\top}\nabla_x \log h(X_t, t)\Big]\,dt + \sigma\,dW_t,$$
where
$$h(x, t) = \mathbb{E}\Big[\prod_{i:\, t_i \ge t} g_i\big(y_i \mid X_{t_i}\big) \,\Big|\, X_t = x\Big]$$
is a “backward survival” function propagating posterior information from future observations. This SDE generates exactly the posterior law with the correct initial condition $X_0 \sim p_0(x)\, h(x, 0)/\mathcal{Z}$.
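As a concrete toy instance of the Feynman–Kac model, the sketch below assumes a one-dimensional Brownian-motion prior ($b = 0$, $\sigma = 1$) and two Gaussian observation likelihoods, and estimates the marginal likelihood $\mathcal{Z} = p(y_{1:N})$ by importance sampling with the prior as proposal; all numbers and variable names are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy prior: dX_t = dW_t (zero drift, sigma = 1), X_0 = 0.
# Two Gaussian observations y_i ~ N(X_{t_i}, r) at irregular times.
obs_times = np.array([0.3, 1.0])
ys = np.array([0.5, -0.2])
r = 0.25  # observation noise variance

n_paths = 200_000
x = np.zeros(n_paths)
log_w = np.zeros(n_paths)  # accumulated Feynman-Kac log-potentials per path
t_prev = 0.0
for t_i, y_i in zip(obs_times, ys):
    dt = t_i - t_prev
    x = x + np.sqrt(dt) * rng.standard_normal(n_paths)  # exact Brownian step
    log_w += -0.5 * np.log(2 * np.pi * r) - 0.5 * (y_i - x) ** 2 / r
    t_prev = t_i

# Monte Carlo estimate of log Z = log p(y_1, y_2).
log_Z_mc = np.log(np.mean(np.exp(log_w)))

# Closed form: (X_{t_1}, X_{t_2}) is jointly Gaussian with
# cov = min(t_i, t_j), so y is Gaussian with cov = min(t_i, t_j) + r*I.
C = np.minimum.outer(obs_times, obs_times) + r * np.eye(2)
log_Z_exact = (-np.log(2 * np.pi) - 0.5 * np.log(np.linalg.det(C))
               - 0.5 * ys @ np.linalg.solve(C, ys))
print(log_Z_mc, log_Z_exact)
```

The same weighting scheme underlies particle approximations of the Feynman–Kac posterior; here the Gaussian structure makes the normalizing constant available in closed form for comparison.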
2. Variational Family, Amortization, and Auxiliary Variables
The intractability of $h$ motivates a tractable variational family: controlled SDEs parameterized by neural controls,
$$dX_t = \big[b(X_t, t) + \sigma\, u_\theta(X_t, t)\big]\,dt + \sigma\,dW_t,$$
with induced path law $\mathbb{Q}^{u_\theta}$. In amortized inference, per-observation latent variables $z_i$ are encoded via $q_\phi(z_i \mid y_i)$ and decoded with $p_\psi(y_i \mid z_i)$. The control $u_\theta$ is parameterized to depend on latent histories $z_{1:i}$ or the full latent collection $z_{1:N}$, typically via a transformer or RNN.
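A minimal sketch of simulating such a controlled SDE is given below, assuming a small hand-rolled tanh network as the control and a scalar Ornstein–Uhlenbeck prior drift; `u_theta`, `simulate_controlled`, and all parameter values are hypothetical illustrations, not the paper's architecture. The simulator also accumulates the quadratic control cost that will appear in the ELBO.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-layer tanh network standing in for the control u_theta(x, t).
W1 = 0.1 * rng.standard_normal((16, 2))
b1 = np.zeros(16)
W2 = 0.1 * rng.standard_normal((1, 16))

def u_theta(x, t):
    """Control evaluated batch-wise on states x at time t."""
    inp = np.stack([x, np.full_like(x, t)], axis=-1)  # features (x, t)
    return (np.tanh(inp @ W1.T + b1) @ W2.T).squeeze(-1)

def simulate_controlled(b, sigma, x0, T, n_steps, n_paths):
    """Euler-Maruyama for dX = [b(X,t) + sigma*u_theta(X,t)] dt + sigma dW.

    Also accumulates the quadratic control cost 0.5 * int |u|^2 dt,
    i.e. the Girsanov term entering the continuous-time ELBO.
    """
    dt = T / n_steps
    x = np.full(n_paths, x0, dtype=float)
    cost = np.zeros(n_paths)
    for k in range(n_steps):
        t = k * dt
        u = u_theta(x, t)
        cost += 0.5 * u ** 2 * dt
        x += (b(x, t) + sigma * u) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return x, cost

# Example: OU prior drift b(x, t) = -x.
xT, cost = simulate_controlled(lambda x, t: -x, sigma=0.5, x0=0.0,
                               T=1.0, n_steps=100, n_paths=4096)
print(xT.mean(), cost.mean())
```

In an amortized setup the weights of the control network would additionally be conditioned on the encoded latents $z_{1:N}$, e.g. via a transformer producing per-interval control parameters.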
3. Stochastic Optimal Control Formulation and Dynamic Programming
SOC theory provides a variational foundation for continuous-time inference. Define the cost functional
$$J(u) = \mathbb{E}_{\mathbb{Q}^{u}}\Big[\int_0^T \tfrac{1}{2}\big\|u(X_t, t)\big\|^2\,dt - \sum_{i=1}^{N} \log g_i\big(y_i \mid X_{t_i}\big)\Big].$$
The value function $V(x, t)$ satisfies, on subintervals $(t_i, t_{i+1})$, the Hamilton–Jacobi–Bellman (HJB) PDE
$$\partial_t V + \mathcal{L} V - \tfrac{1}{2}\big\|\sigma^{\top}\nabla_x V\big\|^2 = 0,$$
where $\mathcal{L}$ is the infinitesimal generator of the prior SDE, together with jump conditions absorbing $-\log g_i(y_i \mid x)$ at each observation time. The minimizer is $u^*(x, t) = -\sigma^{\top}\nabla_x V(x, t)$. The Hopf–Cole transform $V = -\log h$ linearizes the HJB, relating $V$ to the solution of backward Kolmogorov equations and recovering the Doob control $u^* = \sigma^{\top}\nabla_x \log h$.
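The linearization step can be verified directly. Substituting $V = -\log h$ into the HJB PDE (written here for a scalar diffusion coefficient $\sigma$ to keep the algebra readable), the gradients are
$$\nabla_x V = -\frac{\nabla_x h}{h}, \qquad \Delta V = -\frac{\Delta h}{h} + \frac{\|\nabla_x h\|^2}{h^2},$$
so, with $\mathcal{L} = b \cdot \nabla_x + \tfrac{\sigma^2}{2}\Delta$,
$$\partial_t V + \mathcal{L} V - \tfrac{1}{2}\big\|\sigma \nabla_x V\big\|^2
= -\frac{1}{h}\Big(\partial_t h + b \cdot \nabla_x h + \tfrac{\sigma^2}{2}\Delta h\Big)
+ \tfrac{\sigma^2}{2}\frac{\|\nabla_x h\|^2}{h^2} - \tfrac{\sigma^2}{2}\frac{\|\nabla_x h\|^2}{h^2}
= -\frac{1}{h}\big(\partial_t h + \mathcal{L} h\big).$$
The quadratic terms cancel exactly, so the nonlinear HJB reduces to the linear backward Kolmogorov equation $\partial_t h + \mathcal{L} h = 0$ between observations, and the optimal control $u^* = -\sigma \nabla_x V = \sigma \nabla_x \log h$ coincides with the Doob control.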
4. Variational Bound and Continuous-Time ELBO Construction
Using Girsanov’s theorem, the KL divergence between the variational path law and the path posterior is
$$D_{\mathrm{KL}}\big(\mathbb{Q}^{u} \,\big\|\, \mathbb{P}^*\big) = \mathbb{E}_{\mathbb{Q}^{u}}\Big[\int_0^T \tfrac{1}{2}\big\|u(X_t, t)\big\|^2\,dt - \sum_{i=1}^{N} \log g_i\big(y_i \mid X_{t_i}\big)\Big] + \log \mathcal{Z} = J(u) + \log \mathcal{Z},$$
excluding an initial-law term that vanishes at the optimum. Setting $u = u^*$ yields a tight variational characterization: $D_{\mathrm{KL}}(\mathbb{Q}^{u^*} \,\|\, \mathbb{P}^*) = 0$, so that $\log \mathcal{Z} = -J(u^*)$. For any other control, nonnegativity of the KL divergence gives $\log \mathcal{Z} \ge -J(u)$; minimizing $J(u)$ therefore corresponds to solving the optimal control problem and tightens the ELBO.
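The looseness of the bound for a suboptimal control can be seen numerically. The sketch below, a toy assumption-laden example with a Brownian-motion prior, a single Gaussian observation, and the zero control $u \equiv 0$, estimates the uncontrolled path-space ELBO $-J(0) = \mathbb{E}_{\mathbb{P}}[\log g(y \mid X_T)]$ by Monte Carlo and compares it to its closed form and to $\log \mathcal{Z}$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy model: prior dX = dW (so X_T ~ N(0, T)), one observation y ~ N(X_T, r).
# With the zero control u = 0, the path-space ELBO is
#   -J(0) = E_P[log g(y | X_T)]  <=  log Z = log N(y; 0, T + r).
T, r, y = 1.0, 0.25, 0.5

def log_gauss(v, mean, var):
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (v - mean) ** 2 / var

# Monte Carlo estimate of the uncontrolled ELBO.
xT = np.sqrt(T) * rng.standard_normal(500_000)
elbo_mc = np.mean(log_gauss(y, xT, r))

# Closed forms for comparison.
elbo_exact = -0.5 * np.log(2 * np.pi * r) - (y ** 2 + T) / (2 * r)
log_Z = log_gauss(y, 0.0, T + r)
print(elbo_mc, elbo_exact, log_Z)
```

The zero control gives a valid but loose bound; only the optimal Doob control closes the gap $\log \mathcal{Z} - (-J(u)) = D_{\mathrm{KL}}(\mathbb{Q}^u \| \mathbb{P}^*)$ to zero.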
Combining the path-space bound with the standard VAE objective over the latent variables, the negative ELBO takes the form
$$-\mathrm{ELBO}(\theta, \phi, \psi) = \mathbb{E}_{q_\phi(z_{1:N} \mid y_{1:N})}\big[J(u_\theta)\big] + D_{\mathrm{KL}}\big(q_\phi(z_{1:N} \mid y_{1:N}) \,\big\|\, p(z_{1:N})\big),$$
where $u_\theta$ depends on the encoded latents $z_{1:N}$. The key object is the controlled path-space cost $J(u_\theta)$, in which the observation likelihoods are evaluated through the decoder $p_\psi$.
This structure enables end-to-end training by maximizing the ELBO jointly across all variational and generative parameters.
5. Assumptions, Practical Strategies, and Simulation-free ELBO
All drifts and controls are assumed Lipschitz with linear growth to guarantee the existence of strong solutions of the SDEs. The Hopf–Cole transform and the use of Girsanov’s theorem require Novikov-type moment conditions for validity. In practical implementations, the optimal drift is replaced by a parametric neural control optimized via the ELBO.
Costly simulation of pathwise SDEs and backpropagation through continuous-time integrators can be circumvented by a piecewise locally linear drift ansatz: $b(x, t) + \sigma\, u_\theta(x, t) \approx A_k x + a_k$ for $t \in [\tau_k, \tau_{k+1})$, so that state marginals evolve as Gaussian processes with closed-form mean and covariance updates. This approximation enables efficient, simulation-free, parallel ELBO computation. Amortized control construction is performed by modern attention-based networks (e.g., transformers) operating over the latent collection $z_{1:N}$.
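The closed-form marginal propagation under a linear drift segment can be sketched as follows, for a one-dimensional segment $dX = (aX + c)\,dt + \sigma\,dW$ (the function name and parameters are illustrative); the exact Gaussian update is cross-checked against Euler–Maruyama Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(3)

def propagate_gaussian(m, P, a, c, sigma, dt):
    """Exact mean/variance update for dX = (a X + c) dt + sigma dW
    over a step of length dt (one 1-D linear-drift segment, a != 0)."""
    e = np.exp(a * dt)
    m_new = e * m + (c / a) * (e - 1.0)
    P_new = e ** 2 * P + sigma ** 2 * (e ** 2 - 1.0) / (2.0 * a)
    return m_new, P_new

# Piecewise-linear schedule: here ten identical OU segments of length 0.1.
a, c, sigma = -1.0, 0.5, 0.8
m, P = 2.0, 0.0  # deterministic start X_0 = 2
for _ in range(10):
    m, P = propagate_gaussian(m, P, a, c, sigma, 0.1)

# Cross-check against Euler-Maruyama Monte Carlo over the same horizon.
n, steps, dt = 50_000, 500, 0.002
x = np.full(n, 2.0)
for _ in range(steps):
    x += (a * x + c) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
print(m, x.mean(), P, x.var())
```

Because each segment's update is in closed form, the marginals at all grid points can be computed without simulating paths, and the per-segment updates are cheap enough to evaluate in parallel across segments and samples.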
6. Summary and Practical Implementation
The continuous-time ELBO, as realized in "Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series," synthesizes stochastic optimal control, Feynman–Kac representations, and deep amortized inference, resulting in an end-to-end objective. All nested expectations are tractable using:
1. sampling of the latents $z_{1:N}$ from the encoder $q_\phi$;
2. neural construction of the control $u_\theta$ via sequence models;
3. either numerical SDE simulation or closed-form marginal propagation in the piecewise linear case;
4. likelihood decoding via $p_\psi(y_i \mid z_i)$.
The formulation provides a theoretically grounded and computationally practical route to sequential data assimilation in continuous time, particularly for irregular time series (Park et al., 2024).
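The four steps can be strung together in a deliberately minimal end-to-end sketch for a single observation. Everything here is a stand-in assumption: a linear-Gaussian encoder $q_\phi(z \mid y) = \mathcal{N}(0.8\,y,\ 0.3^2)$, a driftless prior, the hypothetical control $u(x, t; z) = z - x$, and a Gaussian decoder; none of these is the paper's parameterization.

```python
import numpy as np

rng = np.random.default_rng(4)

# One observation y at time T; prior dX = sigma dW; decoder N(y; X_T, r).
y, T, sigma, r = 0.5, 1.0, 1.0, 0.25
n_paths, n_steps = 20_000, 50
dt = T / n_steps

# Step 1: sample z ~ q_phi(z | y), a Gaussian with hypothetical linear mean,
# and compute KL(q_phi || N(0, 1)) in closed form.
mu_z, s_z = 0.8 * y, 0.3
z = mu_z + s_z * rng.standard_normal(n_paths)
kl_z = 0.5 * (mu_z ** 2 + s_z ** 2 - 1.0 - np.log(s_z ** 2))

# Steps 2-3: controlled simulation with the hypothetical control u = z - x,
# accumulating the Girsanov control cost.
x = np.zeros(n_paths)
cost = np.zeros(n_paths)
for _ in range(n_steps):
    u = z - x
    cost += 0.5 * u ** 2 * dt
    x += sigma * u * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

# Step 4: decode with the Gaussian likelihood log p_psi(y | X_T).
log_lik = -0.5 * np.log(2 * np.pi * r) - 0.5 * (y - x) ** 2 / r

neg_elbo = np.mean(cost - log_lik) + kl_z
print(neg_elbo)
```

In a real implementation all parameters would be trained jointly by minimizing this Monte Carlo negative-ELBO estimate with a stochastic-gradient optimizer, or, under the piecewise linear ansatz, by replacing the inner simulation loop with closed-form Gaussian marginal updates.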