Probabilistic Calibration by Design for Neural Network Regression (2403.11964v1)
Abstract: Generating calibrated and sharp neural network predictive distributions for regression problems is essential for optimal decision-making in many real-world applications. To address the miscalibration of neural networks, various methods have been proposed, including post-hoc methods that adjust predictions after training and regularization methods that act during training. While post-hoc methods generally improve calibration more than regularization methods, the post-hoc step is completely independent of model training. We introduce a novel end-to-end model training procedure called Quantile Recalibration Training, which integrates post-hoc calibration directly into the training process without additional parameters. We also present a unified algorithm that includes our method and other post-hoc and regularization methods as particular cases. We demonstrate the performance of our method in a large-scale experiment involving 57 tabular regression datasets, showcasing improved predictive accuracy while maintaining calibration. We also conduct an ablation study to evaluate the significance of different components within our proposed method, as well as an in-depth analysis of the impact of the base model and different hyperparameters on predictive accuracy.
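The post-hoc ingredient that Quantile Recalibration Training folds into training is quantile recalibration: learn a monotone map R on [0, 1] from the model's probability integral transform (PIT) values and compose it with the predictive CDF F, so that R(F(y)) is approximately uniform. Below is a minimal NumPy sketch of that post-hoc step in isolation; the function names and toy data are illustrative, not taken from the paper.

```python
import numpy as np
from math import erf, sqrt

def fit_recalibration_map(pit_values):
    """Fit the empirical CDF of PIT values p_i = F_i(y_i).

    For a perfectly calibrated model the PIT values are uniform on
    [0, 1]; composing the fitted map R with the predictive CDF F
    makes R(F(y)) approximately uniform, i.e. calibrated.
    """
    sorted_pit = np.sort(np.asarray(pit_values))

    def recalibrate(p):
        # R(p) = fraction of calibration PIT values <= p
        return np.searchsorted(sorted_pit, p, side="right") / len(sorted_pit)

    return recalibrate

def normal_cdf(x, loc=0.0, scale=1.0):
    # Standard-library Gaussian CDF via the error function.
    return 0.5 * (1.0 + erf((x - loc) / (scale * sqrt(2.0))))

# Toy example: an overconfident Gaussian model (true scale 2, predicted scale 1).
rng = np.random.default_rng(0)
y = rng.normal(0.0, 2.0, size=5000)
pit = np.array([normal_cdf(v) for v in y])  # F(y) under the miscalibrated model
R = fit_recalibration_map(pit)
# R corrects the overconfidence: R(F(1.96)) = R(0.975) lands near the
# true coverage Phi(1.96 / 2) ~ 0.84 rather than the nominal 0.975.
```

The paper's contribution is not this post-hoc step itself but making it part of the training loop: the recalibration map is applied during training so the loss is evaluated on the recalibrated predictive distribution, rather than fitting R once after training has finished.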
Authors: Victor Dheur, Souhaib Ben Taieb