Online Sequential Bayesian Updating

Updated 20 April 2026
  • Online sequential Bayesian updating is a recursive inference method that incrementally refines posterior distributions as new data arrives.
  • It employs exact conjugate updates, variational approximations, and particle filtering to handle diverse models under streaming and dynamic conditions.
  • The approach is vital for real-time forecasting, anomaly detection, and continual learning, ensuring computational efficiency in high-frequency data regimes.

Online Sequential Bayesian Updating is a family of methodologies for recursive statistical inference, wherein the posterior distribution is updated incrementally as new observations or data batches arrive, leveraging the previous posterior as the new prior. This paradigm underpins a substantial portion of the modern Bayesian literature on inference under streaming, high-frequency, or distributed data regimes, and finds theoretical justification as the natural operationalization of Bayes’ theorem in the presence of sequential or temporally indexed data streams. The approach is applicable across parametric, nonparametric, latent-variable, and dynamic state-space models; can be realized exactly or with approximations; and aligns with loss-based generalizations (e.g., Gibbs posteriors in PAC-Bayes) for broader online learning and decision-theoretic settings.

1. Theorem of Recursive Bayesian Updating

At its core, online Bayesian updating is described by the recursive formula

$$p_t(\theta) \propto p_{t-1}(\theta)\, L(D_t \mid \theta)$$

where $p_{t-1}(\theta)$ is the posterior after the first $t-1$ blocks or points, $L(D_t \mid \theta)$ is the likelihood for the new block or observation, and $p_t(\theta)$ is the updated posterior. In canonical streaming applications, observations may arrive singly or in mini-batches $D_t$, and the update proceeds using only prior sufficient statistics and the new data, never the full history (Hooten et al., 2018, Lee et al., 8 Apr 2025, Kamariotis et al., 2021). In dynamic models (e.g., state-space, filtering), latent variables (e.g., $x_t$ in HMMs) are also handled recursively using model-specific marginalization.
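As a minimal sketch of this recursion, the following Python snippet applies the update on a discretised parameter grid and checks that processing the data in mini-batches reproduces the posterior from a single pass over the full history. The Bernoulli model, the grid, the flat prior, and the names `update`, `log_p_seq`, and `log_p_batch` are illustrative assumptions and do not come from the cited works.

```python
import numpy as np

# Hypothetical setup: Bernoulli data with unknown success probability theta,
# discretised on a grid so the recursion can be applied literally.
theta = np.linspace(0.001, 0.999, 999)
log_prior = np.full(theta.size, -np.log(theta.size))  # flat p_0(theta), in log space

def update(log_p_prev, batch):
    """One step of p_t(theta) proportional to p_{t-1}(theta) * L(D_t | theta), in log space."""
    k = np.sum(batch)                 # successes in the new batch
    n = len(batch)
    log_post = log_p_prev + k * np.log(theta) + (n - k) * np.log1p(-theta)
    return log_post - np.logaddexp.reduce(log_post)  # normalise over the grid

rng = np.random.default_rng(0)
data = rng.binomial(1, 0.3, size=200)
batches = np.array_split(data, 10)    # D_1, ..., D_10

log_p_seq = log_prior
for batch in batches:
    log_p_seq = update(log_p_seq, batch)   # uses only the previous posterior and D_t

log_p_batch = update(log_prior, data)      # one update with the full history

max_abs_diff = float(np.max(np.abs(np.exp(log_p_seq) - np.exp(log_p_batch))))
posterior_mean = float(np.sum(theta * np.exp(log_p_seq)))
print(max_abs_diff, posterior_mean)
```

The sequential and batch posteriors agree up to floating-point error because the likelihood factorises across batches; working in log space avoids underflow on grid points far from the data.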

Several formulations exist:

Thus, sequential online updating is not confined to regression or i.i.d. settings but unifies Bayesian filtering, latent-variable inference, and recursive structure learning under a common framework.

2. Exact Bayesian Filtering and Conjugate Cases

In models where prior and likelihood are conjugate (e.g., normal–normal, exponential-family–conjugate pairs), each update is analytic, and the sufficient statistics (moments, counts, etc.) can be incrementally maintained. This enables online inference at $O(1)$ cost per observation, or at a cost proportional to the batch size (Lee et al., 8 Apr 2025, Dinh et al., 2016, Romeres et al., 2016, Aktekin et al., 2016). For example:

  • Kalman filter: Exact Gaussian update of mean $\mu_t$ and covariance $\Sigma_t$ per observation, applicable to Bayesian neural network weights under linear–Gaussian likelihoods (Wagner et al., 2021, Duran-Martin, 12 May 2025, Romeres et al., 2016).
  • Bayesian model selection: Conjugate priors enable variable inclusion and marginal-likelihood updating with Laplace, BIC, or renewable-summary approximations (Ghosal et al., 19 Jan 2025).
  • Dynamic models: Analytic updates of filtering distributions for state and static parameters (e.g., Gamma–Poisson for count models) using sufficient statistics (Aktekin et al., 2016).

This leads to algorithms that maintain only low-dimensional summaries and do not re-access full data, suitable for high-velocity or memory-limited streaming applications.
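The sketch below illustrates an exact conjugate update for Bayesian linear regression with known noise variance, which corresponds to the static-parameter special case of the Kalman filter (no state dynamics). The function name `gaussian_update`, the simulated data, and the prior settings are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_update(mu, Sigma, X, y, noise_var):
    """Exact Gaussian posterior over weights after observing y = X @ w + noise."""
    prec_prev = np.linalg.inv(Sigma)
    prec_new = prec_prev + X.T @ X / noise_var          # precision accumulates X^T X
    Sigma_new = np.linalg.inv(prec_new)
    mu_new = Sigma_new @ (prec_prev @ mu + X.T @ y / noise_var)
    return mu_new, Sigma_new

rng = np.random.default_rng(0)
d, n, noise_var = 3, 300, 0.25
w_true = np.array([1.0, -2.0, 0.5])                     # hypothetical ground truth
X = rng.standard_normal((n, d))
y = X @ w_true + np.sqrt(noise_var) * rng.standard_normal(n)

mu0, Sigma0 = np.zeros(d), 10.0 * np.eye(d)             # assumed prior N(0, 10 I)

# Streaming pass: each batch is seen once and then discarded.
mu_seq, Sigma_seq = mu0, Sigma0
for X_t, y_t in zip(np.array_split(X, 6), np.array_split(y, 6)):
    mu_seq, Sigma_seq = gaussian_update(mu_seq, Sigma_seq, X_t, y_t, noise_var)

# Reference: a single update using all data at once.
mu_batch, Sigma_batch = gaussian_update(mu0, Sigma0, X, y, noise_var)
print(np.allclose(mu_seq, mu_batch), np.allclose(Sigma_seq, Sigma_batch))
```

Only the mean vector and the covariance matrix are carried between steps, so memory does not grow with the number of observations.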

3. Approximate and Variational Methods

When conjugacy or analytic tractability is absent, approximate inference methods enable online sequential Bayesian updating.

Variational Bayes (VB) and Extensions

  • Online Variational Bayes: Given an approximating family $\mathcal{Q}$, each update targets the pseudo-posterior

$$\tilde{p}_t(\theta) \propto q_{t-1}(\theta)\, L(D_t \mid \theta)$$

using the new data's likelihood and the previous VB approximation $q_{t-1}$ as the prior. One minimizes $\mathrm{KL}(q \,\|\, \tilde{p}_t)$ over $q \in \mathcal{Q}$, equivalently maximizing the evidence lower bound (ELBO) by stochastic gradient ascent, typically implementing updates over only the new data and thereby reducing the per-step computational burden to a cost that scales with the new batch $D_t$ rather than the full history (Tomasetti et al., 2019, Lee et al., 8 Apr 2025, Kochurov et al., 2018).

$$\mathrm{ELBO}_t(q) = \mathbb{E}_{q}\big[\log L(D_t \mid \theta)\big] - \mathrm{KL}\big(q \,\|\, q_{t-1}\big)$$

where the prior for the ELBO at step $t$ is the previous approximate posterior (Kochurov et al., 2018, Tomasetti et al., 2019).

  • Online Bernstein–von Mises: Under mild smoothness and a suitable batch-size-to-dimension scaling condition, the composition of Gaussian approximations at each update retains frequentist validity and is asymptotically equivalent (in total variation) to the batch posterior (Lee et al., 8 Apr 2025).
  • Importance-sampling-based updates (UVB-IS): Various strategies reuse samples from the prior q, weighting for new likelihood contributions to further accelerate updates at minimal loss in accuracy (Tomasetti et al., 2019).
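The link between the KL objective and the ELBO in online VB can be made explicit with a short derivation. The notation here is an assumption introduced for illustration: $q_{t-1}$ denotes the previous variational approximation, $\tilde{p}_t(\theta) = q_{t-1}(\theta)\, L(D_t \mid \theta) / Z_t$ the normalised pseudo-posterior, and $Z_t = \int q_{t-1}(\theta)\, L(D_t \mid \theta)\, d\theta$ its normalising constant.

```latex
\begin{aligned}
\mathrm{KL}\big(q \,\|\, \tilde{p}_t\big)
  &= \mathbb{E}_q\big[\log q(\theta)\big] - \mathbb{E}_q\big[\log \tilde{p}_t(\theta)\big] \\
  &= \mathbb{E}_q\big[\log q(\theta) - \log q_{t-1}(\theta)\big]
     - \mathbb{E}_q\big[\log L(D_t \mid \theta)\big] + \log Z_t \\
  &= \mathrm{KL}\big(q \,\|\, q_{t-1}\big)
     - \mathbb{E}_q\big[\log L(D_t \mid \theta)\big] + \log Z_t \\
  &= \log Z_t - \mathrm{ELBO}_t(q),
\qquad
\mathrm{ELBO}_t(q) = \mathbb{E}_q\big[\log L(D_t \mid \theta)\big] - \mathrm{KL}\big(q \,\|\, q_{t-1}\big).
\end{aligned}
```

Since $\log Z_t$ does not depend on $q$, minimizing the KL divergence to the pseudo-posterior is the same as maximizing the ELBO, and the ELBO involves only the new batch $D_t$ and the previous approximation.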

Sequential Monte Carlo (SMC) and Particle Methods

  • Particle Filter / SMC: Particles represent the current posterior as a weighted ensemble $\{(\theta^{(i)}, w_t^{(i)})\}_{i=1}^{N}$, updated via

$$w_t^{(i)} \propto w_{t-1}^{(i)}\, L(D_t \mid \theta^{(i)})$$

with periodic resampling (when effective sample size degrades) and often followed by rejuvenation moves (e.g., MCMC, kernel smoothing) to avoid particle impoverishment (Menictas et al., 2023, Dinh et al., 2016, Xie et al., 25 Nov 2025).
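A minimal sketch of one such particle update for a static parameter follows, assuming a Gaussian likelihood with known variance. The function `smc_step`, the ESS threshold of one half, and the Gaussian jitter used as a crude stand-in for a rejuvenation move are all illustrative assumptions, not the schemes of the cited papers.

```python
import numpy as np

rng = np.random.default_rng(1)

def smc_step(particles, log_w, batch, sigma=1.0, ess_frac=0.5, jitter=0.05):
    """Reweight particles by the new batch's likelihood; resample if the ESS is low."""
    # log L(D_t | theta) for y ~ N(theta, sigma^2), evaluated at every particle
    sq_err = (batch[None, :] - particles[:, None]) ** 2
    log_w = log_w - 0.5 * np.sum(sq_err, axis=1) / sigma**2
    log_w = log_w - np.logaddexp.reduce(log_w)          # normalise the weights
    w = np.exp(log_w)
    ess = 1.0 / np.sum(w**2)                            # effective sample size
    if ess < ess_frac * particles.size:
        # Multinomial resampling, then a small Gaussian jitter (assumed rejuvenation)
        idx = rng.choice(particles.size, size=particles.size, p=w)
        particles = particles[idx] + jitter * rng.standard_normal(particles.size)
        log_w = np.full(particles.size, -np.log(particles.size))
    return particles, log_w, ess

# Hypothetical stream: 200 observations around a true value of 2.0
data = 2.0 + rng.standard_normal(200)
n_particles = 2000
particles = 3.0 * rng.standard_normal(n_particles)      # draws from an assumed N(0, 3^2) prior
log_w = np.full(n_particles, -np.log(n_particles))

for batch in np.array_split(data, 20):
    particles, log_w, ess = smc_step(particles, log_w, batch)

estimate = float(np.sum(np.exp(log_w) * particles))     # weighted posterior mean
print(estimate, float(data.mean()))
```

The jitter slightly inflates the posterior spread; a proper MCMC move targeting the current posterior would avoid that bias at higher cost.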

Generalisations and Robustified Updates

  • Gibbs/Generalized Posteriors: Online updating via pseudo-likelihoods or general loss functions (e.g., exponentiated regret, adversarial tasks) produces Gibbs posteriors and is key for regret minimization and PAC-Bayes-motivated online learning; see, e.g.,

$$p_t(\theta) \propto p_{t-1}(\theta)\, \exp\{-\eta\, \ell(\theta; D_t)\}$$

where $\ell$ is a loss function and $\eta > 0$ a learning rate, with SMC sampling and theoretical O(√T) regret bounds for bounded, mixable losses (Xie et al., 25 Nov 2025, Wu et al., 2024, Duran-Martin, 12 May 2025).

  • Robust Bayesian filters: Loss-adapted or weighted updates (using, e.g., Mahalanobis or robust loss weighting) maintain sequential updating under outlier or model-misspecification regimes, sometimes preserving Kalman filter form (Duran-Martin, 12 May 2025).
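To illustrate a loss-based update, the sketch below replaces the likelihood with an exponentiated absolute loss on a parameter grid, so that the resulting Gibbs posterior concentrates near the sample median and is less sensitive to gross outliers than the sample mean. The grid, the learning rate `eta`, the contaminated data, and the name `gibbs_update` are assumptions for illustration.

```python
import numpy as np

theta = np.linspace(-5.0, 5.0, 1001)                    # assumed parameter grid
log_post = np.full(theta.size, -np.log(theta.size))     # flat prior in log space

def gibbs_update(log_p_prev, batch, eta=1.0):
    """p_t proportional to p_{t-1} * exp(-eta * loss), with absolute loss summed over the batch."""
    loss = np.sum(np.abs(batch[None, :] - theta[:, None]), axis=1)
    log_p = log_p_prev - eta * loss
    return log_p - np.logaddexp.reduce(log_p)           # normalise over the grid

rng = np.random.default_rng(0)
clean = 1.0 + rng.standard_normal(190)
# Hypothetical contamination: 5% of observations replaced by a gross outlier at 50
data = rng.permutation(np.concatenate([clean, np.full(10, 50.0)]))

for batch in np.array_split(data, 20):
    log_post = gibbs_update(log_post, batch)

gibbs_mode = float(theta[np.argmax(log_post)])
print(gibbs_mode, float(np.median(data)), float(data.mean()))
```

The recursion has the same form as the likelihood-based update, with the log-likelihood swapped for a negative scaled loss; the choice of `eta` controls how quickly the posterior concentrates.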

4. Non-Stationarity, Adaptivity, and Memory Design

Classical recursive Bayes presumes static parameters. Extensions to non-stationary, drift, or changepoint regimes require memory or model-adaptive mechanisms.

  • Forgetting/Adaptive Memory: Mechanisms downweight or selectively recall past data to facilitate adaptation to regime switches, recurring environments, or non-stationarity. BAM (Bayes with Adaptive Memory) introduces a greedy (approximate) optimization of which past datapoints to remember, generalizing fixed forgetting, sliding windows, power priors, and unlearning as special cases (Nassar et al., 2022).
  • Runlength- or changepoint-aware priors: Models such as Bayesian online changepoint detection (0710.3742) or adaptive filtering (Duran-Martin, 12 May 2025) parameterize priors/updates by current runlength or environmental state, enabling immediate (and uncertainty-aware) learning upon regime switches.
  • Drift and covariance inflation: Online filters inject artificial dynamics or inflate the prior covariance so that the posterior remains able to adapt to shifts and does not accumulate overconfidence (Duran-Martin, 12 May 2025, Duran-Martin et al., 13 Jun 2025).
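A simple fixed-forgetting scheme can be sketched with a Beta–Bernoulli model in which old pseudo-counts are discounted toward the prior before each update. This is one common variant chosen for illustration, not the BAM algorithm or the changepoint methods cited above; the function `discounted_update`, the discount `gamma`, and the synthetic stream are assumptions.

```python
import numpy as np

def discounted_update(alpha, beta, x, gamma, a0=1.0, b0=1.0):
    """Shrink pseudo-counts toward the prior (a0, b0) by gamma, then add outcome x in {0, 1}."""
    alpha = gamma * alpha + (1.0 - gamma) * a0 + x
    beta = gamma * beta + (1.0 - gamma) * b0 + (1 - x)
    return alpha, beta

# Hypothetical stream with an abrupt regime switch: all successes, then all failures.
stream = np.concatenate([np.ones(200), np.zeros(200)])

static = (1.0, 1.0)      # gamma = 1 recovers the standard Beta-Bernoulli update
forgetful = (1.0, 1.0)   # gamma < 1 discounts old evidence
for x in stream:
    static = discounted_update(*static, x, gamma=1.0)
    forgetful = discounted_update(*forgetful, x, gamma=0.95)

static_mean = static[0] / sum(static)
forgetful_mean = forgetful[0] / sum(forgetful)
print(static_mean, forgetful_mean)
```

With `gamma = 1` the posterior mean stays at 0.5 after the switch because both regimes are weighted equally, whereas the discounted version tracks the new regime with an effective memory of roughly `1 / (1 - gamma)` observations.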

5. Algorithmic and Computational Aspects

Efficient online Bayesian inference requires control of storage, compute, and approximation complexity.

  • Sufficient statistics storage: For exponential-family likelihoods and Gaussian models, summary statistics (e.g., sums, empirical covariances) are maintained and updated in $O(d^2)$ per step, where $d$ is the parameter dimension (Ghosal et al., 19 Jan 2025, Duran-Martin, 12 May 2025, Menictas et al., 2023).
  • Particle filters and SMC: Per-step cost is $O(N)$ in the particle count $N$. Scalability is controlled by bounding the variance of the incremental weights, and stability is supported by resampling together with theoretical lower bounds on the effective sample size, even as model dimension increases (Dinh et al., 2016, Menictas et al., 2023).
  • Kalman updates and block structure: For high-dimensional models such as deep Bayesian neural networks, blockwise or low-rank updates for groups of weights allow propagation of posterior uncertainty and reduction of computational cost, while retaining well-defined predictive distributions (Duran-Martin et al., 13 Jun 2025, Wagner et al., 2021).
  • Batch-to-online translation: Many MCMC/VI pipelines can be recast in online/recursive form by updating with each batch's log-likelihood and using the previous approximate posterior as the new prior, without refitting the model on all data (Hooten et al., 2018, Lee et al., 8 Apr 2025, Tomasetti et al., 2019).

6. Guarantees, Theory, and Empirical Findings

  • Bernstein–von Mises for Online VB: Provided mini-batch size exceeds a critical threshold depending on parameter dimension and number of steps, sequential variational updates deliver asymptotically normal posteriors that are indistinguishable from full-batch posteriors in total variation distance (Lee et al., 8 Apr 2025).
  • O(√T) Regret for Bayesian Online Optimization: Gibbs-posterior-based online Bayesian updating yields O(√T) regret bounds for contextual optimization and learning with bounded and mixable losses (Xie et al., 25 Nov 2025, Wu et al., 2024).
  • Consistency and Efficiency of Particle Methods: For online SMC/particle learning in growing-dimensional or phylogenetic models, stability and consistency are guaranteed as particle count increases, with effective sample size growing linearly and no exponential degeneracy (Dinh et al., 2016, Aktekin et al., 2016).
  • Avoidance of Catastrophic Forgetting: In deep learning, retaining the previous approximate posterior as the new prior suppresses catastrophic forgetting compared to naïve fine-tuning, as empirically confirmed for neural networks on sequential tasks (Kochurov et al., 2018).
  • Limitations and Trade-offs: Fully online (per data-point) VB or SMC may accumulate approximation error unless batch sizes scale appropriately, and SMC for non-Gaussian likelihoods imposes higher per-update costs due to full-data revisit (Tomasetti et al., 2019, Menictas et al., 2023). Memory selection in adaptive-memory filters is NP-hard, often mitigated via heuristics (Nassar et al., 2022). Complexity control and tuning are thus critical for reliable and scalable deployment.

7. Applications and Broader Contexts

Online sequential Bayesian updating is foundational in domains requiring continual, instantaneous, or memory-limited inference:

Its integration with robust, adaptive, and scalable methodologies continues to drive advances in model-based learning, scalable inference, and statistical decision-theoretic frameworks.
