
Uncertainty Quantification for Molecular Property Predictions with Graph Neural Architecture Search (2307.10438v3)

Published 19 Jul 2023 in cs.LG, physics.chem-ph, and q-bio.BM

Abstract: Graph Neural Networks (GNNs) have emerged as a prominent class of data-driven methods for molecular property prediction. However, a key limitation of typical GNN models is their inability to quantify uncertainties in the predictions. This capability is crucial for ensuring the trustworthy use and deployment of models in downstream tasks. To that end, we introduce AutoGNNUQ, an automated uncertainty quantification (UQ) approach for molecular property prediction. AutoGNNUQ leverages architecture search to generate an ensemble of high-performing GNNs, enabling the estimation of predictive uncertainties. Our approach employs variance decomposition to separate data (aleatoric) and model (epistemic) uncertainties, providing valuable insights for reducing them. In our computational experiments, we demonstrate that AutoGNNUQ outperforms existing UQ methods in terms of both prediction accuracy and UQ performance on multiple benchmark datasets. Additionally, we utilize t-SNE visualization to explore correlations between molecular features and uncertainty, offering insight for dataset improvement. AutoGNNUQ has broad applicability in domains such as drug discovery and materials science, where accurate uncertainty quantification is crucial for decision-making.
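The variance decomposition described in the abstract is commonly implemented via the law of total variance over an ensemble: each member predicts a mean and a variance, the average of the predicted variances estimates the aleatoric (data) uncertainty, and the spread of the predicted means estimates the epistemic (model) uncertainty. A minimal sketch of this idea, assuming hypothetical per-model outputs (not the paper's actual architectures or datasets):

```python
import numpy as np

# Hypothetical ensemble output: M models each predict, for every molecule,
# a mean and a variance (as in a deep ensemble trained with a Gaussian
# negative log-likelihood). Shapes: (M, N) for M models, N molecules.
rng = np.random.default_rng(0)
M, N = 8, 5
means = rng.normal(loc=0.0, scale=1.0, size=(M, N))   # per-model predicted means
variances = rng.uniform(0.1, 0.5, size=(M, N))        # per-model predicted variances

# Ensemble prediction: average of the member means.
ensemble_mean = means.mean(axis=0)

# Variance decomposition (law of total variance):
#   aleatoric (data) uncertainty  = average of the predicted variances
#   epistemic (model) uncertainty = variance of the predicted means
aleatoric = variances.mean(axis=0)
epistemic = means.var(axis=0)
total = aleatoric + epistemic
```

In this decomposition, aleatoric uncertainty reflects noise inherent to the data and cannot be reduced by better models, while epistemic uncertainty shrinks as the ensemble members agree, e.g. with more training data.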

Authors (5)
  1. Shengli Jiang
  2. Shiyi Qin
  3. Reid C. Van Lehn
  4. Prasanna Balaprakash
  5. Victor M. Zavala
Citations (5)
