Calibration-Aware Bayesian Learning (2305.07504v2)
Abstract: Deep learning models, including modern systems such as LLMs, are well known to provide unreliable estimates of the uncertainty of their decisions. To improve a model's calibration, i.e., the quality of its confidence levels, common approaches add either data-dependent or data-independent regularization terms to the training loss. Data-dependent regularizers were recently introduced in the context of conventional frequentist learning to penalize deviations between confidence and accuracy. In contrast, data-independent regularizers are at the core of Bayesian learning, enforcing adherence of the variational distribution in the model parameter space to a prior density. The former approach cannot quantify epistemic uncertainty, while the latter is severely affected by model misspecification. In light of the limitations of both methods, this paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs), that applies both regularizers while optimizing over a variational distribution as in Bayesian learning. Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
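The objective described in the abstract can be sketched in code. The following is a minimal illustration, not the paper's implementation: it assumes a mean-field Gaussian variational posterior over the weights of a single linear layer, a standard-normal prior, and an MMCE-style kernel penalty (Kumar et al., 2018) as the data-dependent calibration regularizer. The names `VariationalLinear`, `mmce_penalty`, and `ca_bnn_loss`, as well as all hyperparameter values, are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn.functional as F

class VariationalLinear(torch.nn.Module):
    """Mean-field Gaussian variational posterior over the weights of one
    linear layer -- a minimal stand-in for a full Bayesian neural network."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = torch.nn.Parameter(0.1 * torch.randn(d_out, d_in))
        self.rho = torch.nn.Parameter(torch.full((d_out, d_in), -3.0))

    def forward(self, x):
        # Reparameterization trick: w = mu + softplus(rho) * eps
        sigma = F.softplus(self.rho)
        w = self.mu + sigma * torch.randn_like(sigma)
        return x @ w.t()

    def kl_to_prior(self):
        # Closed-form KL(q(w) || N(0, I)) for a diagonal Gaussian q:
        # the data-independent regularizer of standard Bayesian learning.
        var = F.softplus(self.rho) ** 2
        return 0.5 * (var + self.mu ** 2 - var.log() - 1.0).sum()

def mmce_penalty(probs, labels, width=0.4):
    """Differentiable data-dependent calibration penalty in the spirit of
    MMCE (Kumar et al., 2018): a kernel-weighted measure of the mismatch
    between per-sample confidence and correctness."""
    conf, pred = probs.max(dim=1)
    resid = (pred == labels).float() - conf  # correctness minus confidence
    kernel = torch.exp(-(conf[:, None] - conf[None, :]).abs() / width)
    # Quadratic form with a positive-definite (Laplacian) kernel, so the
    # clamp only guards against numerical round-off before the sqrt.
    return (resid[:, None] * resid[None, :] * kernel).mean().clamp(min=0).sqrt()

def ca_bnn_loss(model, x, y, cal_weight=1.0, kl_weight=1e-3, n_samples=4):
    """CA-BNN-style objective: Monte Carlo NLL + calibration penalty
    (data-dependent) + KL to the prior (data-independent)."""
    nll, cal = 0.0, 0.0
    for _ in range(n_samples):  # average over draws from q(w)
        logits = model(x)
        nll = nll + F.cross_entropy(logits, y)
        cal = cal + mmce_penalty(logits.softmax(dim=1), y)
    nll, cal = nll / n_samples, cal / n_samples
    return nll + cal_weight * cal + kl_weight * model.kl_to_prior()

# Toy usage on random data (shapes only; not a real experiment).
model = VariationalLinear(d_in=784, d_out=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(128, 784), torch.randint(0, 10, (128,))
loss = ca_bnn_loss(model, x, y)
loss.backward()
opt.step()
```

At test time, the same variational distribution supports Bayesian model averaging: predictions are obtained by averaging softmax outputs over several weight samples, which is what lets the model expose the epistemic uncertainty that a purely frequentist, data-dependent regularizer cannot capture.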