Accuracy-Preserving Calibration via Statistical Modeling on Probability Simplex
Abstract: Classification models based on deep neural networks (DNNs) must be calibrated to measure the reliability of predictions. Some recent calibration methods have employed a probabilistic model on the probability simplex. However, these calibration methods cannot preserve the accuracy of pre-trained models, even those with a high classification accuracy. We propose an accuracy-preserving calibration method that uses the Concrete distribution as the probabilistic model on the probability simplex. We prove theoretically that a DNN model trained with the cross-entropy loss is optimal as the parameter of the Concrete distribution. We also propose an efficient method that synthetically generates samples for training probabilistic models on the probability simplex. We demonstrate on benchmarks that the proposed method can outperform previous methods in accuracy-preserving calibration tasks. The code is available at https://github.com/ToyotaCRDL/SimplexTS.
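To make the two ideas named in the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes NumPy and the standard Gumbel-softmax sampling form of the Concrete distribution; the function names (`softmax`, `sample_concrete`), the example logits, and the temperature values are illustrative assumptions. It draws samples that lie on the probability simplex and shows that dividing logits by a positive temperature leaves the predicted class unchanged, which is the sense in which such a rescaling preserves accuracy.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def sample_concrete(logits, temperature, n_samples, rng):
    """Draw samples from a Concrete (Gumbel-softmax) distribution.

    logits      : unnormalized log location parameters, shape (K,)
    temperature : positive scalar; smaller values push samples toward vertices
    Returns an array of shape (n_samples, K) whose rows lie on the simplex.
    """
    u = rng.uniform(size=(n_samples,) + logits.shape)
    u = np.clip(u, 1e-12, 1.0 - 1e-12)   # keep both logarithms finite
    gumbel = -np.log(-np.log(u))         # standard Gumbel(0, 1) noise
    return softmax((logits + gumbel) / temperature)


rng = np.random.default_rng(0)
logits = np.array([2.0, 0.5, -1.0])      # hypothetical classifier output

# Samples on the probability simplex.
samples = sample_concrete(logits, temperature=0.5, n_samples=1000, rng=rng)

# Rescaling logits by a positive temperature changes confidence,
# but not which class has the highest probability.
scaled = softmax(logits / 2.0)
same_prediction = np.argmax(scaled) == np.argmax(logits)
```

Each row of `samples` is a point on the simplex, and `same_prediction` is true for any positive temperature because division by a positive constant does not change the ordering of the logits.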