Estimating Epistemic and Aleatoric Uncertainty with a Single Model (2402.03478v2)

Published 5 Feb 2024 in cs.LG and cs.CV

Abstract: Estimating and disentangling epistemic uncertainty, uncertainty that is reducible with more training data, and aleatoric uncertainty, uncertainty that is inherent to the task at hand, is critically important when applying machine learning to high-stakes applications such as medical imaging and weather forecasting. Conditional diffusion models' breakthrough ability to accurately and efficiently sample from the posterior distribution of a dataset now makes uncertainty estimation conceptually straightforward: one need only train and sample from a large ensemble of diffusion models. Unfortunately, training such an ensemble becomes computationally intractable as the complexity of the model architecture grows. In this work we introduce a new approach to ensembling, hyper-diffusion models (HyperDM), which allows one to accurately estimate both epistemic and aleatoric uncertainty with a single model. Unlike existing single-model uncertainty methods like Monte Carlo dropout and Bayesian neural networks, HyperDM offers prediction accuracy on par with, and in some cases superior to, multi-model ensembles. Furthermore, our proposed approach scales to modern network architectures such as Attention U-Net and yields more accurate uncertainty estimates compared to existing methods. We validate our method on two distinct real-world tasks: X-ray computed tomography reconstruction and weather temperature forecasting.
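
The ensemble recipe the abstract describes maps to a standard law-of-total-variance decomposition: epistemic uncertainty is the spread across ensemble members (in HyperDM, weight sets produced by a hypernetwork rather than separately trained models), while aleatoric uncertainty is the average spread of posterior samples within each member. The sketch below is illustrative only: sample_member_weights and sample_posterior are toy stand-ins for the paper's hypernetwork and conditional diffusion sampler, not the authors' actual code.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_member_weights():
        """Toy placeholder: draw one ensemble member's weights.
        In HyperDM this would mean sampling the hypernetwork's latent
        and generating the weights of one diffusion model."""
        return rng.normal()

    def sample_posterior(weights, x, n_samples):
        """Toy placeholder: posterior samples from one member for input x.
        In HyperDM these would come from running the reverse diffusion
        process conditioned on x under the generated weights."""
        return weights + x + rng.normal(scale=0.5, size=n_samples)

    def estimate_uncertainty(x, n_members=32, n_samples=64):
        member_means, member_vars = [], []
        for _ in range(n_members):
            w = sample_member_weights()
            ys = sample_posterior(w, x, n_samples)
            member_means.append(ys.mean())
            member_vars.append(ys.var())
        # Law-of-total-variance decomposition:
        # aleatoric  = expected within-member variance,
        # epistemic  = variance of the member means.
        aleatoric = np.mean(member_vars)
        epistemic = np.var(member_means)
        return np.mean(member_means), aleatoric, epistemic

    pred, alea, epi = estimate_uncertainty(x=1.0)
    print(f"prediction={pred:.3f}  aleatoric={alea:.3f}  epistemic={epi:.3f}")

This is the same decomposition used by multi-model deep ensembles; the paper's contribution, per the abstract, is that a single trained hypernetwork generates the member weights, avoiding the cost of training a full ensemble of diffusion models.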
