Bayesian posterior approximation with stochastic ensembles (2212.08123v3)

Published 15 Dec 2022 in cs.LG, cs.CV, and stat.ML

Abstract: We introduce ensembles of stochastic neural networks to approximate the Bayesian posterior, combining stochastic methods such as dropout with deep ensembles. The stochastic ensembles are formulated as families of distributions and trained to approximate the Bayesian posterior with variational inference. We implement stochastic ensembles based on Monte Carlo dropout, DropConnect and a novel non-parametric version of dropout and evaluate them on a toy problem and CIFAR image classification. For both tasks, we test the quality of the posteriors directly against Hamiltonian Monte Carlo simulations. Our results show that stochastic ensembles provide more accurate posterior estimates than other popular baselines for Bayesian inference.
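
The abstract describes the method only at a high level: each ensemble member is itself a stochastic network (e.g. Monte Carlo dropout), and the posterior predictive is approximated by averaging over both ensemble members and stochastic forward passes. The sketch below illustrates that prediction-time averaging under assumed details; PyTorch, the MLP architecture, the dropout rate, and the member/sample counts are illustrative assumptions rather than the authors' implementation, and training each member with the paper's variational objective is omitted.

```python
# Minimal sketch (not the authors' code): a stochastic ensemble that combines
# deep ensembles with Monte Carlo dropout. Each member is an independently
# trained dropout network; the approximate posterior predictive averages over
# both ensemble members and dropout samples.
import torch
import torch.nn as nn


def make_member(p_drop: float = 0.1) -> nn.Module:
    """One stochastic ensemble member: a small MLP with dropout layers."""
    return nn.Sequential(
        nn.Linear(32, 128), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(128, 128), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(128, 10),
    )


@torch.no_grad()
def predictive_distribution(members, x, n_dropout_samples: int = 20):
    """Average the softmax output over ensemble members and dropout masks."""
    probs = []
    for net in members:
        net.train()  # keep dropout active at prediction time ("MC dropout")
        for _ in range(n_dropout_samples):
            probs.append(torch.softmax(net(x), dim=-1))
    return torch.stack(probs).mean(dim=0)  # shape: (batch, classes)


if __name__ == "__main__":
    # Hypothetical usage: five members (in practice each trained independently).
    ensemble = [make_member() for _ in range(5)]
    x = torch.randn(4, 32)
    print(predictive_distribution(ensemble, x).shape)  # torch.Size([4, 10])
```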
