- The paper introduces a unified framework that uses Bregman divergences for Bayesian model selection, enhancing robustness to outliers and model misspecification.
- The methodology leverages score-matched updating, prediction, and evaluation to produce consistent posterior predictive densities with strong theoretical guarantees.
- The approach, especially using β-divergence, demonstrates improved empirical performance and computational efficiency across diverse real-world applications.
Robust Bayesian Predictive Model Selection Using Bregman Divergence
Summary and Context
The paper introduces a framework for Bayesian predictive model selection based on Bregman divergences that generalizes standard log-score-based approaches such as the leave-one-out expected log predictive density (ELPD). The approach aims to make predictive model comparison more robust to outliers and model misspecification by using proper scoring rules derived from Bregman divergences (BDs), with particular emphasis on the β-divergence subclass.
The methodology systematically aligns prior updating, posterior predictive construction, and model comparison using a single, non-log scoring rule. This joint score matching allows the selected model to optimally approximate the true data-generating process (DGP) under the chosen divergence. The framework provides theoretical guarantees for consistency and robustness and demonstrates practical gains in diverse applied domains, such as microbial ecology and forensic science.
Technical Contributions
Generalized Bayesian Predictive Model Comparison
The paper replaces the log-score (and its implicit Kullback–Leibler divergence) with a general Bregman scoring rule within leave-one-out cross-validation (LOO-CV), yielding a generalized ELPD (g-ELPD). For each candidate model, parameters are updated through generalized Bayes via the chosen Bregman score. The resulting posterior predictive density is used for out-of-sample utility evaluation under the same score:
- Score-matched updating: Parameters are updated using the Bregman-induced loss.
- Score-matched prediction: The posterior predictive is constructed as the Bregman centroid (Bayes action) of the model family.
- Score-matched evaluation: Out-of-sample utility is measured with the same proper scoring rule.
This construction ensures that the prediction target is the distribution closest, in the chosen Bregman divergence, to the true DGP.
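As a rough illustration of score-matched updating, the sketch below computes a generalized-Bayes posterior on a grid for a normal location model, once under the log loss and once under a β-score loss. Everything concrete here is an assumption made for illustration rather than the paper's setup: the flat prior, the unit-variance normal model, the toy data, the helper names (`log_loss`, `beta_loss`, `grid_posterior`), and the β-score parameterization (one common form in which β → 1 recovers the log score up to additive constants).

```python
import numpy as np

SIGMA = 1.0  # assumed known scale of the normal model


def log_loss(y, mu):
    """Negative log density of N(mu, SIGMA^2) at y."""
    return 0.5 * ((y - mu) / SIGMA) ** 2 + np.log(SIGMA * np.sqrt(2 * np.pi))


def beta_loss(y, mu, beta=1.5):
    """Assumed beta-score loss: -f(y)^(beta-1)/(beta-1) + (1/beta) * int f^beta."""
    dens = np.exp(-0.5 * ((y - mu) / SIGMA) ** 2) / (SIGMA * np.sqrt(2 * np.pi))
    # Closed form of the integral of N(mu, SIGMA^2)^beta over the real line.
    integral = (2 * np.pi * SIGMA**2) ** (-(beta - 1) / 2) / np.sqrt(beta)
    return -dens ** (beta - 1) / (beta - 1) + integral / beta


def grid_posterior(y, mu_grid, loss):
    """Generalized posterior on a grid: flat prior times exp(-total loss)."""
    total_loss = loss(y[None, :], mu_grid[:, None]).sum(axis=1)
    log_post = -total_loss
    w = np.exp(log_post - log_post.max())  # subtract the max for stability
    return w / w.sum()


# Toy data (hypothetical): 30 inliers placed symmetrically around 0, one outlier.
y = np.concatenate([np.linspace(-2.0, 2.0, 30), [10.0]])
mu_grid = np.linspace(-5.0, 15.0, 2001)

mean_log = float(np.sum(mu_grid * grid_posterior(y, mu_grid, log_loss)))
mean_beta = float(np.sum(mu_grid * grid_posterior(y, mu_grid, beta_loss)))
print(f"posterior mean, log loss:  {mean_log:.3f}")
print(f"posterior mean, beta loss: {mean_beta:.3f}")
```

In this toy setting the log-loss posterior mean is pulled toward the outlier (to about 10/31, the sample mean), while the β-loss posterior mean stays close to the center of the inliers.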
Theoretical Guarantees
The authors provide a rigorous asymptotic theory:
- Posterior Concentration: The generalized posterior concentrates around pseudo-true parameters minimizing the expected Bregman divergence to the DGP.
- Predictive Consistency: The posterior predictive density converges to the pseudo-true predictive density.
- Model Selection Consistency: LOO-CV using the same Bregman score consistently selects the model whose predictive density is closest to the DGP under the target divergence (see Theorems 4.2–4.4).
They additionally present results on the limiting behavior under score-mismatched updating and evaluation (i.e., when the Bregman score used for parameter updating differs from the one used for evaluation), formally showing that such procedures are consistent for a hybrid criterion, rather than the pure score-matched objective.
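The pseudo-true parameter in the posterior concentration result can be written out as follows. The notation is assumed here and may differ from the paper's: g is the DGP, p_θ the model density, S a proper Bregman score treated as a loss, D_S the divergence it induces, and any learning-rate factor in the generalized posterior is omitted.

```latex
\pi_S(\theta \mid y_{1:n}) \;\propto\; \pi(\theta)\,
  \exp\!\Big\{-\sum_{i=1}^{n} S(y_i, p_\theta)\Big\},
\qquad
\theta^{\ast} \;=\; \arg\min_{\theta}\, \mathbb{E}_{y \sim g}\big[S(y, p_\theta)\big]
\;=\; \arg\min_{\theta}\, D_S(g, p_\theta).
```

The second equality uses the standard identity for proper scoring rules, D_S(g, p) = E_g[S(y, p)] − E_g[S(y, g)], whose last term does not depend on θ.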
Robustness via β-Divergence
Special attention is given to the β-divergence family. Unlike the KL divergence, the β-divergence with β > 1 downweights low-density observations, which yields bounded pointwise score contributions (Proposition 3.2) and stability under contamination (Corollary 3.3). The paper shows, both theoretically and empirically, that this mitigates the undue influence of outliers and tail mismatches that are common in real-world data and misspecified models.
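The boundedness property can be checked numerically in a simple case. The sketch below compares the log score and a β-score of a standard normal predictive density as the observation moves into the tail. The β-score form used is an assumed common parameterization (not necessarily the paper's), and the function names are hypothetical.

```python
import numpy as np


def normal_pdf(y, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


def log_score(y, mu=0.0, sigma=1.0):
    """Log predictive density (higher is better); unbounded below in the tails."""
    return -0.5 * ((y - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))


def beta_score_floor(beta, sigma=1.0):
    """Lower bound of the beta-score for a normal density: -(1/beta) * int f^beta."""
    integral = (2 * np.pi * sigma**2) ** (-(beta - 1) / 2) / np.sqrt(beta)
    return -integral / beta


def beta_score(y, beta, mu=0.0, sigma=1.0):
    """Assumed beta-score (higher is better): f(y)^(beta-1)/(beta-1) - (1/beta) int f^beta."""
    return normal_pdf(y, mu, sigma) ** (beta - 1) / (beta - 1) + beta_score_floor(beta, sigma)


ys = np.array([0.0, 3.0, 10.0, 100.0, 1000.0])
beta = 1.5
for y_val, ls, bs in zip(ys, log_score(ys), beta_score(ys, beta)):
    print(f"y={y_val:7.1f}  log score={ls:12.2f}  beta score={bs:8.4f}")
print("beta-score floor:", beta_score_floor(beta))
```

The log score decreases quadratically in the distance from the center, whereas the β-score levels off at its floor (about -0.34 for β = 1.5 and unit variance), so a single extreme observation can only contribute a bounded penalty.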
Computation
The authors adapt Pareto-smoothed importance sampling for leave-one-out cross-validation (PSIS-LOO) [Vehtari et al.] to the generalized Bregman posterior, so that g-ELPD can be computed efficiently for large datasets without explicitly refitting the posterior for every leave-one-out split.
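The idea behind importance-sampling LOO for a generalized posterior can be sketched as follows. Because the full-data generalized posterior is proportional to the prior times exp(−Σ loss), dropping observation i multiplies it by exp(+loss_i), so the raw log importance ratio for each draw is simply that draw's loss on observation i. The code below uses plain self-normalized importance sampling without the Pareto smoothing step, and the model, data, β-loss form, and helper names are all assumptions for illustration rather than the paper's implementation.

```python
import numpy as np


def beta_loss(y, mu, beta=1.5, sigma=1.0):
    """Assumed beta-score loss for a N(mu, sigma^2) model."""
    dens = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    integral = (2 * np.pi * sigma**2) ** (-(beta - 1) / 2) / np.sqrt(beta)
    return -dens ** (beta - 1) / (beta - 1) + integral / beta


def normalize(log_p):
    w = np.exp(log_p - log_p.max())
    return w / w.sum()


def loo_weights(loss_draws):
    """Self-normalized LOO weights; loss_draws[s, i] = loss of y_i at draw s."""
    log_r = loss_draws - loss_draws.max(axis=0)  # log ratio is +loss_i
    w = np.exp(log_r)
    return w / w.sum(axis=0)


rng = np.random.default_rng(0)
y = np.concatenate([np.linspace(-2.0, 2.0, 30), [10.0]])  # hypothetical data
mu_grid = np.linspace(-5.0, 15.0, 2001)
loss_grid = beta_loss(y[None, :], mu_grid[:, None])       # shape (grid, n)

# Draw from the full-data generalized posterior (flat prior, grid-based).
full_post = normalize(-loss_grid.sum(axis=1))
idx = rng.choice(mu_grid.size, size=4000, p=full_post)
draws, loss_draws = mu_grid[idx], loss_grid[idx]

w = loo_weights(loss_draws)                      # shape (draws, n)
loo_mean_is = (w * draws[:, None]).sum(axis=0)   # LOO posterior mean per i
ess = 1.0 / (w**2).sum(axis=0)                   # effective sample size per i

# Exact LOO posterior mean on the grid for observation 0, as a check.
loo_post_0 = normalize(-(loss_grid.sum(axis=1) - loss_grid[:, 0]))
loo_mean_exact_0 = float(np.sum(mu_grid * loo_post_0))
print(loo_mean_is[0], loo_mean_exact_0, ess.min())
```

One side effect of a bounded loss in this toy model is that the importance ratios are bounded as well, so the effective sample size stays high even for the outlying observation. In a real application the raw ratios would still be passed through Pareto smoothing and its diagnostics.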
Empirical Demonstrations
Three examples, one simulated and two applied, illustrate the framework’s impact:
- Simulated Contaminated Normal vs. Miscentered Heavy-Tailed Model: Varying β demonstrates that log-score-based ELPD is dominated by heavy-tailed outliers, favoring miscentered models, while modest β > 1 corrects this by re-focusing on central predictive accuracy.
- Thermal Performance Curves (TPCs) in Microbial Ecology: When comparing various TPC models, standard ELPD selects a model sensitive to boundary outliers; g-ELPD with β > 1 leads to the selection of models better generalizing central tendencies.
- Spatial Modeling for Forensic Footwear Analysis: On a forensic benchmark, the log-score heavily penalizes models due to a single outlier, while g-ELPD with increased β identifies models with superior overall fit, limiting the leverage of outliers.
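The qualitative pattern in the first example can be mimicked in miniature. The sketch below scores two fixed candidate densities, a well-centered standard normal and a miscentered Student-t, on a contaminated sample, using both the log score and a β-score. It is not the paper's experiment: the densities are fixed rather than posterior predictives, the inliers are deterministic normal quantiles, the five outliers sit at 10, and the β-score parameterization and all names are assumptions.

```python
import math
from statistics import NormalDist

import numpy as np


def normal_pdf(y):
    """Candidate A: standard normal, centered on the bulk of the data."""
    return np.exp(-0.5 * y**2) / np.sqrt(2 * np.pi)


def student_t_pdf(y, df=2.0, loc=1.0, scale=1.0):
    """Candidate B: heavy-tailed Student-t, miscentered at loc=1."""
    z = (y - loc) / scale
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2) * scale)
    return c * (1 + z**2 / df) ** (-(df + 1) / 2)


def total_log_score(pdf, y):
    return float(np.sum(np.log(pdf(y))))


def total_beta_score(pdf, y, beta, grid):
    """Sum of f(y)^(beta-1)/(beta-1) - (1/beta) * int f^beta (assumed form)."""
    dx = grid[1] - grid[0]
    integral = np.sum(pdf(grid) ** beta) * dx  # Riemann sum over a wide grid
    return float(np.sum(pdf(y) ** (beta - 1) / (beta - 1) - integral / beta))


# 95 inliers at standard-normal quantiles plus 5 outliers at 10 (hypothetical).
inliers = np.array([NormalDist().inv_cdf((k + 0.5) / 95) for k in range(95)])
y = np.concatenate([inliers, np.full(5, 10.0)])
grid = np.linspace(-100.0, 100.0, 200001)
beta = 1.5

log_a, log_b = total_log_score(normal_pdf, y), total_log_score(student_t_pdf, y)
beta_a = total_beta_score(normal_pdf, y, beta, grid)
beta_b = total_beta_score(student_t_pdf, y, beta, grid)
print(f"log score:  normal={log_a:.1f}  miscentered t={log_b:.1f}")
print(f"beta score: normal={beta_a:.1f}  miscentered t={beta_b:.1f}")
```

Under these assumed settings the log score prefers the miscentered heavy-tailed density, because the five outliers dominate the sum, while the β-score with β = 1.5 prefers the well-centered normal.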
Across these examples, the numerical results support the claim that generalized, score-matched model selection achieves both improved robustness and desirable asymptotic properties.
Implications and Future Directions
The paper’s framework directly addresses longstanding issues in Bayesian model comparison caused by the sensitivity of predictive criteria to outliers and model misspecification. The use of Bregman divergences provides a principled path to robustification without sacrificing coherence or statistical efficiency. The demonstrated theoretical consistency and empirical robustness open new directions:
- Practical robustness: The ability to tune β or use alternative Bregman divergences allows practitioners to adapt to specific domain needs and data characteristics, with reporting of g-ELPD over a range of β enhancing transparency.
- Computational scalability: The adaptation of PSIS-LOO ensures practical tractability for moderate to large datasets; further research into scalable score approximations in high-dimensional or complex observation spaces is warranted.
- Generalization to model averaging and stacking: The same machinery could yield robust predictive weighting strategies beyond winner-take-all selection.
- Uncertainty quantification: Extensions to rigorous standard errors for generalized-score model comparisons remain an open avenue for investigation.
Conclusion
This work provides a robust, theoretically grounded, and practically implementable extension of Bayesian predictive model selection using Bregman divergences. By unifying parameter updating, prediction, and evaluation under a single, tunable proper scoring rule, the method delivers consistent and outlier-resistant model selection in the M-open regime. Application to varied domains demonstrates empirical effectiveness, and the theoretical results clarify the precise conditions under which robust model selection is achieved. The framework generalizes naturally to alternative divergences and predictive combination methods, pointing toward a flexible and robust future for Bayesian model comparison.
Reference:
"Robust Bayesian Predictive Model Selection using Bregman Divergence" (2606.10409)