Deep Bayes Factors (2312.05411v3)

Published 8 Dec 2023 in stat.ME, stat.CO, and stat.ML

Abstract: There is no other model or hypothesis verification tool in Bayesian statistics that is as widely used as the Bayes factor. We focus on generative models that are likelihood-free and therefore render the computation of Bayes factors (marginal likelihood ratios) far from obvious. We propose a deep learning estimator of the Bayes factor based on simulated data from two competing models, using the likelihood ratio trick. This estimator requires no summary statistics and thereby avoids some of the difficulties of ABC model choice. We establish sufficient conditions for the consistency of our Deep Bayes Factor estimator as well as its consistency as a model selection tool. We investigate the performance of our estimator on various examples using a wide range of quality metrics related to estimation and model decision accuracy. After training, our deep learning approach enables rapid evaluation of the Bayes factor estimator at any fictitious data arising from either hypothesized model, not just the observed data $Y_0$. This allows us to inspect entire Bayes factor distributions under the two models and to quantify the relative location of the Bayes factor evaluated at $Y_0$ in light of these distributions. Such tail-area evaluations are not possible for Bayes factor estimators tailored to $Y_0$. We find the performance of our Deep Bayes Factors competitive with existing MCMC techniques that require knowledge of the likelihood function. We also consider variants for posterior and intrinsic Bayes factor estimation. We demonstrate the usefulness of our approach on a relatively high-dimensional real data example concerned with determining cognitive biases.
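
The core idea described in the abstract is to train a binary classifier on simulated datasets from the two competing models and read the Bayes factor off the classifier's predicted odds (the likelihood ratio trick): with balanced classes, a classifier D(Y) trained to distinguish draws from the two marginals satisfies D(Y) / (1 - D(Y)) ≈ m1(Y) / m2(Y). The sketch below illustrates this under toy assumptions; the two location models (Gaussian vs. Student-t), the sorting used as a simple permutation-invariant representation, and the scikit-learn MLP are all illustrative choices, not the paper's actual architecture or examples.

```python
# Minimal sketch of the likelihood ratio trick for Bayes factor estimation.
# The models, network, and data representation are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_sim, n_obs = 5000, 20  # simulated datasets per model, observations each

def simulate_m1(n_sim, n_obs):
    # Model 1: Y_i ~ N(theta, 1) with prior theta ~ N(0, 1)
    theta = rng.normal(0.0, 1.0, size=(n_sim, 1))
    return rng.normal(theta, 1.0, size=(n_sim, n_obs))

def simulate_m2(n_sim, n_obs):
    # Model 2: Y_i ~ theta + t_3 noise with the same prior on theta
    theta = rng.normal(0.0, 1.0, size=(n_sim, 1))
    return theta + rng.standard_t(3, size=(n_sim, n_obs))

# Balanced training set: label 1 for model 1 draws, label 0 for model 2 draws.
# Sorting each dataset's observations is one crude way to respect
# exchangeability without hand-crafted summary statistics.
X = np.sort(np.vstack([simulate_m1(n_sim, n_obs),
                       simulate_m2(n_sim, n_obs)]), axis=1)
y = np.concatenate([np.ones(n_sim), np.zeros(n_sim)])

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500).fit(X, y)

def deep_bayes_factor(Y):
    # With balanced classes, D(Y) / (1 - D(Y)) estimates m1(Y) / m2(Y),
    # i.e. the Bayes factor BF_12 evaluated at the dataset Y.
    d = clf.predict_proba(np.sort(Y).reshape(1, -1))[0, 1]
    d = np.clip(d, 1e-6, 1.0 - 1e-6)  # guard against degenerate odds
    return d / (1.0 - d)

Y0 = rng.normal(0.3, 1.0, size=n_obs)  # stand-in for the observed data
print("estimated BF_12(Y0):", deep_bayes_factor(Y0))
```

Because the trained classifier is amortized over the whole data space, it can also be evaluated cheaply at fresh draws from either model; this is what enables the tail-area diagnostics relative to $Y_0$ described in the abstract.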
