Uncertainty Estimation by Fisher Information-based Evidential Deep Learning (2303.02045v3)
Abstract: Uncertainty estimation is a key factor in making deep learning reliable for practical applications. Recently proposed evidential neural networks explicitly account for different uncertainties by treating the network's outputs as evidence to parameterize a Dirichlet distribution, and they achieve impressive performance in uncertainty estimation. However, for samples with high data uncertainty that are nevertheless annotated with a one-hot label, the evidence-learning process for those mislabeled classes is over-penalized and remains hindered. To address this problem, we propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL). In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, and use it to dynamically reweight the objective loss terms so that the network focuses more on representation learning for uncertain classes. The generalization ability of the network is further improved by optimizing the PAC-Bayesian bound. As demonstrated empirically, our proposed method consistently outperforms traditional EDL-related algorithms on multiple uncertainty estimation tasks, especially in the more challenging few-shot classification settings.
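To make the abstract's description concrete, below is a minimal sketch of an evidential classification loss with a Fisher-information-based reweighting. It assumes the common EDL parameterization (evidence from a softplus over the logits, Dirichlet concentrations $\alpha$ = evidence + 1) and weights the per-class Bayes-risk terms by the diagonal of the Dirichlet Fisher Information Matrix; the function name `i_edl_loss_sketch` and the exact normalization and direction of the weighting are illustrative assumptions, not the paper's precise objective (which also includes a PAC-Bayesian regularization term not shown here).

```python
import torch
import torch.nn.functional as F


def i_edl_loss_sketch(logits: torch.Tensor, labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Hedged sketch of a Fisher-information-weighted evidential loss.

    Assumes the usual EDL setup alpha = evidence + 1 with evidence = softplus(logits);
    the actual I-EDL weighting scheme in the paper may differ from the diagonal-FIM
    weights used below.
    """
    evidence = F.softplus(logits)                  # non-negative per-class evidence
    alpha = evidence + 1.0                         # Dirichlet concentration parameters
    alpha0 = alpha.sum(dim=-1, keepdim=True)       # Dirichlet strength
    probs = alpha / alpha0                         # expected class probabilities

    y = F.one_hot(labels, num_classes).float()

    # Diagonal of the Dirichlet Fisher Information Matrix:
    # I(alpha)_kk = trigamma(alpha_k) - trigamma(alpha_0), which is non-negative.
    fim_diag = torch.polygamma(1, alpha) - torch.polygamma(1, alpha0)

    # Standard EDL Bayes-risk (squared-error) terms per class.
    err = (y - probs) ** 2
    var = probs * (1.0 - probs) / (alpha0 + 1.0)

    # Illustrative reweighting: classes whose evidence carries little information
    # (small FIM entries) contribute less to the penalty, so their evidence
    # learning is not over-penalized by the one-hot target.
    w = fim_diag.detach()
    w = w / w.sum(dim=-1, keepdim=True)
    return (w * (err + var)).sum(dim=-1).mean()
```

As a usage note, this loss would replace the vanilla EDL Bayes-risk term during training, e.g. `loss = i_edl_loss_sketch(model(x), y, num_classes)`; uncertainty at test time is then read off from the predicted Dirichlet strength as in standard EDL.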