Overview of Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with β-Divergences
This paper introduces an innovative approach to changepoint detection in non-stationary streaming data through a robust Bayesian framework using β-divergences. The proposed method addresses the high false discovery rates common in standard Bayesian On-line Changepoint Detection (BOCPD), particularly in the presence of outliers, by integrating robustness directly into the inference process. The core contributions of this paper are the application of General Bayesian Inference (GBI) with β-divergences, the development of structural variational approximations for scalable computation, and a principled approach to optimizing the divergence parameter β.
Robustness in Bayesian Inference
Traditional Bayesian inference implicitly minimizes the Kullback-Leibler divergence (KLD) between the probabilistic model and the data-generating process. While effective in the M-closed world, the KLD copes poorly with outliers and model misspecification because its influence function is unbounded: the more extreme an observation, the more it distorts the posterior. The paper instead performs General Bayesian Inference (GBI) with the β-divergence, whose influence function has a unique maximum. An observation's influence first grows as it moves away from the posterior mean, then decays sharply once it deviates far enough to be treated as an outlier. The parameter β regulates the degree of robustness, ensuring that a single outlier does not trigger a false changepoint declaration.
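To make the bounded-influence behaviour concrete, here is a minimal sketch (not taken from the paper) for the mean of a Gaussian model: the gradient of the β-divergence loss with respect to μ is proportional to (x − μ)·f(x | μ, σ)², raised to the power β in the density factor, i.e. (x − μ)·f(x | μ, σ)^β, which recovers the unbounded KLD influence (x − μ) as β → 0.

```python
import numpy as np
from scipy.stats import norm

def influence(x, mu=0.0, sigma=1.0, beta=0.25):
    """Gradient of the beta-divergence loss w.r.t. mu for a N(mu, sigma^2)
    model, up to a constant: (x - mu) * f(x | mu, sigma)^beta.
    For beta -> 0 this reduces to the unbounded KLD influence (x - mu)."""
    return (x - mu) * norm.pdf(x, mu, sigma) ** beta

# Influence rises for moderate deviations, then decays for extreme outliers.
xs = np.array([0.5, 2.0, 5.0, 10.0])
print(influence(xs))
```

Plotting `influence` over a range of x shows the single interior maximum described above, in contrast to the straight line obtained at β = 0.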
Scalability and Computational Efficiency
A key challenge for robust Bayesian inference, particularly on streaming data, is computational scalability. The paper addresses this with a structured variational approximation that preserves dependence between parameters and recovers the conjugate posterior in the limit β→0. This makes computation efficient, overcoming the historical bottleneck of GBI's intractable posteriors. The structural variational fit is nearly exact in practice, and Theorem 2 shows that the approximation reduces to a tractable optimization for a large class of exponential family models.
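As an illustration of why preserving parameter dependence matters (a grid-based sketch under simplifying assumptions, not the paper's variational algorithm), one can evaluate a GBI posterior over a Gaussian's mean and log scale and moment-match a full-covariance Gaussian; the off-diagonal term is exactly what a mean-field factorization q(μ)q(σ) would discard:

```python
import numpy as np
from scipy.stats import norm

def beta_loss(x, mu, sigma, beta=0.25):
    # beta-divergence loss for a N(mu, sigma^2) likelihood; for the
    # Gaussian the integral term has a closed form.
    integral = (2 * np.pi * sigma**2) ** (-beta / 2) / np.sqrt(1 + beta)
    return -norm.pdf(x, mu, sigma) ** beta / beta + integral / (1 + beta)

rng = np.random.default_rng(0)
data = rng.normal(1.0, 1.0, size=50)

# Unnormalised GBI log-posterior on a grid over (mu, tau = log sigma).
mus = np.linspace(0.0, 2.0, 81)
taus = np.linspace(-1.0, 1.0, 81)
M, T = np.meshgrid(mus, taus, indexing="ij")
logpost = -0.5 * (M**2 + T**2)          # weak N(0, 1) priors on mu and tau
for x in data:
    logpost -= beta_loss(x, M, np.exp(T))
post = np.exp(logpost - logpost.max())
post /= post.sum()

# Moment-match a full-covariance Gaussian q(mu, tau).
m_mu, m_tau = (post * M).sum(), (post * T).sum()
c_mumu = (post * (M - m_mu) ** 2).sum()
c_tautau = (post * (T - m_tau) ** 2).sum()
c_mutau = (post * (M - m_mu) * (T - m_tau)).sum()  # lost under mean-field
```

The grid evaluation is only feasible in two dimensions; the point of the paper's structured variational approach is to retain the cross-term `c_mutau` while remaining tractable at scale.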
Optimizing the Divergence Parameter β
Choosing an appropriate value for β is crucial for effective robust inference. The paper offers a systematic initialization of β based on the expected influence of observations, with further refinement achieved by minimizing predictive loss on-line as the stream is processed. This dynamic optimization lets the degree of robustness adapt to changing data characteristics, enabling more accurate model fitting.
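A hedged sketch of the predictive-loss idea (hypothetical, not the paper's procedure): for each candidate β, run a simple sequential grid posterior over a Gaussian mean, accumulate the one-step-ahead squared prediction error, and keep the β that predicts best on a contaminated stream.

```python
import numpy as np
from scipy.stats import norm

def beta_loss(x, mu, beta, sigma=1.0):
    # beta-divergence loss for a N(mu, 1) likelihood (closed-form integral).
    integral = (2 * np.pi * sigma**2) ** (-beta / 2) / np.sqrt(1 + beta)
    return -norm.pdf(x, mu, sigma) ** beta / beta + integral / (1 + beta)

def prequential_loss(stream, beta, grid=np.linspace(-5, 5, 201)):
    """Sequential grid posterior over the mean; accumulates the
    one-step-ahead squared error of the posterior-predictive mean."""
    logpost = -0.5 * grid**2             # N(0, 1) prior on the mean
    total = 0.0
    for x in stream:
        w = np.exp(logpost - logpost.max())
        pred = (w * grid).sum() / w.sum()
        total += (x - pred) ** 2
        logpost = logpost - beta_loss(x, grid, beta)
    return total

rng = np.random.default_rng(1)
stream = rng.normal(0.0, 1.0, 200)
stream[::20] += 8.0                      # inject occasional heavy outliers

betas = [0.05, 0.1, 0.2, 0.4]
best = min(betas, key=lambda b: prequential_loss(stream, b))
```

The grid posterior stands in for whatever inference engine is in use; the selection criterion itself only needs one-step-ahead predictions, which is what makes it suitable for streaming refinement.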
Experimental Validation and Practical Implications
The paper validates the approach on simulations and real-world datasets, including the well-log data, a standard benchmark for changepoint detection, and a high-dimensional analysis of air pollution levels across London. The robust method substantially reduces false discovery rates and yields probabilistic forecasts, making it applicable across domains such as genetics, finance, and cybersecurity. By propagating robustness into both the parameter and run-length posteriors, the approach delivers more reliable inference and uncertainty quantification.
Future Developments
This paper opens avenues for applying robust Bayesian inference to a broader range of models beyond the standard changepoint detection framework. The integration of β-divergences into GBI can be extended to other settings, fostering improved handling of data heterogeneity and outliers. The established computational efficiency and scalability provide a foundation for exploring robust inference in large-scale and high-dimensional data environments, which increasingly characterize modern machine learning tasks.
In summary, this paper represents a substantial advancement in the application and scalability of robust Bayesian inference mechanisms, promoting their use in dynamic and non-stationary data settings. The presented innovations in computational methods and robustness parameter optimization set the stage for broader applicability and refinement in future research endeavors.