AdaFocal: Calibration-aware Adaptive Focal Loss (2211.11838v2)
Abstract: Much recent work has been devoted to the problem of ensuring that a neural network's confidence scores match the true probability of being correct, i.e., the calibration problem. Notably, training with focal loss was found to yield better calibration than cross-entropy while achieving a similar level of accuracy \cite{mukhoti2020}. This success stems from focal loss regularizing the entropy of the model's predictions (controlled by the parameter $\gamma$), thereby reining in the model's overconfidence. Further improvement is expected if $\gamma$ is selected independently for each training sample (Sample-Dependent Focal Loss, FLSD-53 \cite{mukhoti2020}). However, FLSD-53 is based on heuristics and does not generalize well. In this paper, we propose a calibration-aware adaptive focal loss called AdaFocal that exploits the calibration properties of focal (and inverse-focal) loss and adaptively modifies $\gamma_t$ for different groups of samples based on $\gamma_{t-1}$ from the previous step and the model's under/over-confidence on the validation set. We evaluate AdaFocal on various image recognition tasks and one NLP task, covering a wide variety of network architectures, and confirm improved calibration at similar levels of accuracy. Additionally, we show that models trained with AdaFocal achieve a significant boost in out-of-distribution detection.
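Since only the abstract is available here, the following PyTorch sketch is a hypothetical illustration of the two ingredients it describes: a focal loss that accepts a per-sample $\gamma$, and a multiplicative, calibration-aware update that raises $\gamma$ where the validation set shows over-confidence and lowers it where it shows under-confidence. The function names, the update rule `gamma_prev * exp(lam * gap)`, the parameter `lam`, and the clamping bounds are assumptions for illustration, not the paper's exact algorithm (which, per the abstract, also handles a switch to inverse-focal loss).

```python
import math

import torch
import torch.nn.functional as F


def focal_loss(logits, targets, gamma):
    """Per-sample focal loss: -(1 - p_t)^gamma * log(p_t).

    `gamma` may be a scalar or a per-sample tensor, which is what lets an
    AdaFocal-style scheme assign different gammas to different samples.
    """
    log_pt = F.log_softmax(logits, dim=-1)
    log_pt = log_pt.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p of true class
    pt = log_pt.exp()
    return -((1.0 - pt) ** gamma) * log_pt  # per-sample; reduce with .mean()


def update_gamma(gamma_prev, val_confidence, val_accuracy,
                 lam=1.0, gamma_min=1e-2, gamma_max=20.0):
    """Hypothetical calibration-aware update for one group of samples.

    Over-confidence on the validation set (confidence > accuracy) grows
    gamma, strengthening the entropy regularization for this group;
    under-confidence shrinks it. `lam` and the clamping bounds are
    illustrative, and the paper's switch to inverse-focal loss for
    persistently under-confident groups is omitted here.
    """
    gap = val_confidence - val_accuracy          # signed calibration error
    gamma_new = gamma_prev * math.exp(lam * gap)
    return min(max(gamma_new, gamma_min), gamma_max)


# Example: a group that is over-confident by 0.1 sees gamma grow by e^0.1.
logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
gammas = torch.full((4,), update_gamma(gamma_prev=3.0,
                                       val_confidence=0.9, val_accuracy=0.8))
loss = focal_loss(logits, targets, gammas).mean()
```

In a training loop, one would presumably run the update once per group at each validation step and then assign every training sample the $\gamma$ of the group it belongs to; the grouping scheme itself (e.g., by confidence) is detailed in the paper, not here.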
References
- Glenn W. Brier. Verification of forecasts expressed in terms of probability. Monthly Weather Review, 1950.
- ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
- BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics.
- On calibration of modern neural networks. In Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, 2017.
- Calibration of neural networks using splines. In International Conference on Learning Representations, 2021.
- Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- Benchmarking neural network robustness to common corruptions and perturbations. In International Conference on Learning Representations, 2019.
- Densely connected convolutional networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
- Beyond temperature scaling: Obtaining well-calibrated multi-class probabilities with Dirichlet calibration. In Advances in Neural Information Processing Systems, 2019.
- Verified uncertainty calibration. In Advances in Neural Information Processing Systems, volume 32, 2019.
- Aviral Kumar. 20 Newsgroups MMCE. https://github.com/aviralkumar2907/MMCE, 2018.
- Trainable calibration measures for neural networks from kernel mean embeddings. In Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, 2018.
- Ken Lang. Newsweeder: Learning to filter netnews. In Proceedings of the 12th International Machine Learning Conference (ML95), 1995.
- Network in network. CoRR, abs/1312.4400, 2014.
- Focal loss for dense object detection. In 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
- Conditional adversarial domain adaptation. In Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc., 2018.
- Jishnu Mukhoti. Focal calibration. https://github.com/torrvision/focal_calibration, 2020.
- Calibrating deep neural networks using focal loss. In Advances in Neural Information Processing Systems, 2020.
- When does label smoothing help? In Advances in Neural Information Processing Systems, 2019.
- Binary classifier calibration using an ensemble of near isotonic regression models. In 2016 IEEE 16th International Conference on Data Mining (ICDM), 2016.
- Obtaining well calibrated probabilities using Bayesian binning. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, AAAI'15, 2015.
- Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011, 2011.
- Posterior calibration and exploratory analysis for natural language processing models. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2015.
- GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2014.
- Mitigating bias in calibration error estimation, 2021.
- ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
- Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift. In Advances in Neural Information Processing Systems, volume 32, 2019.
- Evaluating model calibration in classification. In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89, pages 3459–3467. PMLR, 16–18 Apr 2019.
- Rethinking calibration of deep neural networks: Do not be afraid of overconfidence. In Advances in Neural Information Processing Systems, volume 34, pages 11809–11820. Curran Associates, Inc., 2021.
- Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In ICML, 2001.
- Transforming classifier scores into accurate multiclass probability estimates. In KDD, 2002.
- Wide residual networks. In Proceedings of the British Machine Vision Conference (BMVC), 2016.
- Mix-n-match: Ensemble and compositional methods for uncertainty calibration in deep learning. In International Conference on Machine Learning, pages 11117–11128. PMLR, 2020.