- The paper demonstrates that incorporating model uncertainty via Bayesian model averaging and projection predictive approaches effectively reduces overfitting.
- It uses extensive numerical experiments on simulated and real-world datasets to compare methods such as cross-validation, WAIC, and DIC.
- The study provides a practical framework for balancing predictive accuracy with model complexity in variable selection tasks.
Comparison of Bayesian Predictive Methods for Model Selection: An Expert Overview
The paper by Piironen and Vehtari presents a rigorous comparison of several Bayesian methods for model selection, focusing on practical applications in variable subset selection for regression and classification. The authors analyze a variety of commonly used Bayesian predictive methods, namely cross-validation (CV), the widely applicable information criterion (WAIC), the deviance information criterion (DIC), and others including reference and projection predictive approaches. These methods are evaluated on the task of identifying models that are useful for prediction.
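To make concrete what criteria such as WAIC estimate, the following minimal sketch (my illustration, not the authors' code) computes the expected log predictive density and the WAIC effective-number-of-parameters correction from a matrix of posterior log-likelihood draws; the input array `log_lik` of shape (draws, observations) is an assumed format.

```python
import numpy as np
from scipy.special import logsumexp

def waic(log_lik):
    """WAIC quantities from posterior log-likelihood draws.

    log_lik: array of shape (S, n) holding log p(y_i | theta^s) for S
    posterior draws and n observations (assumed input format).
    Returns (elpd_waic, p_waic); the deviance-scale WAIC is -2 * elpd_waic.
    """
    S = log_lik.shape[0]
    # log pointwise predictive density: log of the posterior-mean likelihood
    lppd = np.sum(logsumexp(log_lik, axis=0) - np.log(S))
    # effective number of parameters: pointwise posterior variance of log-lik
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    return lppd - p_waic, p_waic
```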
The authors emphasize that predictive model selection aims not to find a true underlying model but rather a model with optimal predictive performance. The comparison is supported by several numerical experiments using both simulated and real-world data. The experiments illustrate that traditional methods such as CV, WAIC, and DIC, though valuable, are prone to overfitting in the selection step, especially when data are scarce. This overfitting arises from high variance in the utility estimates: searching over many candidate models and picking the one with the best noisy estimate induces selection bias and overly optimistic performance evaluations.
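The mechanism can be illustrated with a toy simulation (not from the paper): suppose K candidate models all have the same true expected utility, but each CV estimate of that utility is noisy. Selecting the model with the largest estimate then overstates the chosen model's true performance, and the optimism grows with the number of candidates and the estimate's variance.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 200            # number of candidate models (e.g., variable subsets)
true_utility = 0.0 # every candidate is assumed equally good
noise_sd = 0.5     # std. dev. of each utility estimate (CV noise)
n_rep = 2000       # repetitions of the selection experiment

selected_estimate = np.empty(n_rep)
for r in range(n_rep):
    estimates = true_utility + noise_sd * rng.standard_normal(K)
    selected_estimate[r] = estimates.max()   # pick the apparent best model

print("mean estimate of the selected model:", selected_estimate.mean())
print("true utility of every model:        ", true_utility)
# The gap between the two numbers is the selection-induced optimism.
```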
To combat these issues, the paper advocates methods that incorporate model uncertainty effectively. The authors find that encompassing approaches such as Bayesian model averaging (BMA) provide superior predictive results on average because they account for model uncertainty across all candidate models. When full BMA is computationally impractical, or the space of candidate models is too large, the authors propose the projection predictive method as a robust alternative. This method simplifies the full model by projecting its information onto submodels, yielding submodels that retain much of the full model's predictive accuracy while overfitting less than selection based on CV or WAIC.
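In the Gaussian linear model the projection has a simple form: for each posterior draw, the full model's linear predictor is regressed onto the submodel's covariates. The sketch below is my simplified illustration (it projects only the coefficients and ignores the noise variance); `X`, `beta_draws`, and `sub_idx` are assumed inputs, not objects from the paper.

```python
import numpy as np

def project_draws(X, beta_draws, sub_idx):
    """Project full-model posterior draws onto a submodel (Gaussian case).

    X          : (n, p) design matrix of the full model
    beta_draws : (S, p) posterior draws of the full-model coefficients
    sub_idx    : indices of the covariates kept in the submodel

    Returns (S, len(sub_idx)) projected draws: for each posterior draw,
    the submodel coefficients that best reproduce the full model's fit
    in a least-squares (KL-projection) sense.
    """
    X_sub = X[:, sub_idx]
    fits = beta_draws @ X.T                       # (S, n) full-model fits
    proj, *_ = np.linalg.lstsq(X_sub, fits.T, rcond=None)
    return proj.T                                 # (S, len(sub_idx))
```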
A notable feature of the paper is the use of cross-validation outside the model selection process. This outer evaluation guides the choice of model size and validates the predictive performance of the finally selected model. Because only a handful of comparisons (one per model size along the search path) are assessed, rather than every candidate subset, this approach reduces selection-induced overfitting, which is particularly beneficial when the candidate set is large, as in variable selection.
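Schematically, the entire selection procedure is wrapped in an outer CV loop so that the final performance estimate never reuses the data that chose the model. In the sketch below, `fit_full_model`, `select_submodel`, and `score` are hypothetical placeholders for the reference-model fit, the selection search, and the utility function.

```python
import numpy as np
from sklearn.model_selection import KFold

def outer_cv_utility(X, y, fit_full_model, select_submodel, score, n_splits=10):
    """Estimate the selected model's utility with CV *outside* the selection.

    The selection (e.g., a projection-guided search) runs only on the
    training fold; the held-out fold scores whatever model it chose.
    """
    scores = []
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train, test in kf.split(X):
        full = fit_full_model(X[train], y[train])        # reference/full model
        sub = select_submodel(full, X[train], y[train])  # selection on train only
        scores.append(score(sub, X[test], y[test]))      # honest assessment
    return float(np.mean(scores))
```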
Quantitative results from the paper showcase the superiority of the projection method over maximum a posteriori (MAP) and median probability models. Specifically, the projection approach's ability to mitigate overfitting yields smaller, simpler models with predictive performance close to that of BMA.
The practical impact of this research extends to various applications in predictive modeling where model complexity and data scarcity are pivotal considerations. Theoretically, it advances Bayesian model selection frameworks by integrating model uncertainty into the selection process more effectively than traditional methods.
Future work might explore refinements of the projection method, particularly its computational efficiency and its adaptability to large-scale, complex data. Further research could also develop cross-validation strategies for evaluating the model selection process itself more comprehensively.
In conclusion, Piironen and Vehtari's paper provides a thorough examination of Bayesian predictive methods for model selection, backed by substantial empirical evidence. It underscores the benefit of incorporating model uncertainty for improved predictive accuracy, offering valuable insights for researchers and practitioners in the field.