Uncertainty Quantification and Propagation in Atomistic Machine Learning (2405.02461v3)
Abstract: Machine learning (ML) offers promising new approaches to complex problems and has been increasingly adopted in the chemical and materials sciences. Broadly speaking, ML models employ generic mathematical functions and attempt to learn the essential physics and chemistry from a large amount of data. Because the functional form encodes few physical or chemical principles, the reliability of the predictions is often not guaranteed, particularly for data far out of distribution. It is therefore critical to quantify the uncertainty in model predictions and to understand how that uncertainty propagates to downstream chemical and materials applications. Herein, we review existing uncertainty quantification (UQ) and uncertainty propagation (UP) methods for atomistic ML under a unified framework of probabilistic modeling. We first categorize the UQ methods, aiming to elucidate the similarities and differences between them. We also discuss performance metrics for evaluating the accuracy, precision, calibration, and efficiency of UQ methods, as well as techniques for model recalibration. With these metrics, we survey existing benchmark studies of UQ methods on molecular and materials datasets. Furthermore, we discuss UP methods that propagate the uncertainty obtained from ML models through widely used materials and chemical simulation techniques, such as molecular dynamics and microkinetic modeling. We also remark on the challenges and future opportunities of UQ and UP in atomistic ML.