Uncertainty Quantification Using Neural Networks for Molecular Property Prediction (2005.10036v1)

Published 20 May 2020 in cs.LG, q-bio.QM, and stat.ML

Abstract: Uncertainty quantification (UQ) is an important component of molecular property prediction, particularly for drug discovery applications where model predictions direct experimental design and where unanticipated imprecision wastes valuable time and resources. The need for UQ is especially acute for neural models, which are becoming increasingly standard yet are challenging to interpret. While several approaches to UQ have been proposed in the literature, there is no clear consensus on the comparative performance of these models. In this paper, we study this question in the context of regression tasks. We systematically evaluate several methods on five benchmark datasets using multiple complementary performance metrics. Our experiments show that none of the methods we tested is unequivocally superior to all others, and none produces a particularly reliable ranking of errors across multiple datasets. While we believe these results show that existing UQ methods are not sufficient for all common use-cases and demonstrate the benefits of further research, we conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.

Authors (5)

Lior Hirschfeld
Kyle Swanson
Kevin Yang
Regina Barzilay
Connor W. Coley

Summary

The paper "Uncertainty Quantification Using Neural Networks for Molecular Property Prediction" addresses uncertainty quantification (UQ), a critical aspect of machine learning (ML) applications in drug discovery. UQ matters in molecular property prediction because model predictions often direct experimental design, so unanticipated imprecision wastes time and resources. Neural networks (NNs) are increasingly standard in quantitative structure-activity relationship (QSAR) modeling, but their opacity makes it hard to interpret predictions and assess their robustness. The authors systematically evaluate several UQ techniques on regression tasks using multiple datasets and performance metrics, exposing limitations in existing methods and suggesting avenues for future research.

Evaluation of UQ Methods

The paper evaluates UQ methods on five benchmark regression datasets using complementary metrics such as Spearman's rank correlation, miscalibration area, negative log likelihood (NLL), and calibrated NLL (cNLL). The authors also compare each method's performance across these datasets to draw general conclusions about reliability and applicability.
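The metrics named above can be illustrated with a minimal, standard-library Python sketch. It rests on assumptions that this summary does not confirm: each prediction is treated as a Gaussian with mean `mu` and variance `var`, Spearman's correlation is taken between predicted uncertainty and absolute error, and the calibrated NLL uses a single scalar rescaling of the variance. The paper's exact definitions may differ, and all function names here are hypothetical.

```python
import math

def _ranks(values):
    """Average 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman's rho: Pearson correlation of the ranks (inputs must not be constant)."""
    ra, rb = _ranks(a), _ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = math.sqrt(sum((x - ma) ** 2 for x in ra))
    sb = math.sqrt(sum((y - mb) ** 2 for y in rb))
    return cov / (sa * sb)

def gaussian_nll(y, mu, var):
    """Mean negative log likelihood of y under N(mu, var)."""
    terms = (0.5 * (math.log(2 * math.pi * v) + (t - m) ** 2 / v)
             for t, m, v in zip(y, mu, var))
    return sum(terms) / len(y)

def miscalibration_area(y, mu, var, n_levels=100):
    """Area between the observed and ideal coverage curves (0 is best)."""
    # Smallest central interval of N(mu, var) containing y has
    # probability mass erf(|z| / sqrt(2)), with z the standardized error.
    mass = [math.erf(abs(t - m) / math.sqrt(2 * v))
            for t, m, v in zip(y, mu, var)]
    area = 0.0
    for k in range(1, n_levels + 1):
        p = k / n_levels
        observed = sum(c <= p for c in mass) / len(mass)
        area += abs(observed - p) / n_levels
    return area

def variance_scale(y, mu, var):
    """Scalar s minimizing the Gaussian NLL when every variance becomes s * var."""
    return sum((t - m) ** 2 / v for t, m, v in zip(y, mu, var)) / len(y)
```

In practice the scale would be fitted on a validation set and then applied to the test-set variances before computing the NLL; a rank correlation near 1 between `var` and the absolute errors would indicate that the uncertainties order the errors well.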

Findings and Analysis

The paper finds substantial variability in the performance of UQ methods across datasets. No method is universally superior, and none produces a particularly reliable ranking of errors across multiple datasets, so there is no one-size-fits-all solution for quantifying uncertainty in molecular property predictions with NNs. While message passing neural networks (MPNNs) were generally more accurate than feedforward networks (FFNs), similar reductions in RMSE were observed across models when effective UQ methods were applied. Notably, MPNN RF and MPNN GP performed consistently well on metrics such as miscalibration area and NLL, making them comparatively robust options for UQ in these settings.
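This summary does not say how the RMSE reductions were measured. One common reading, assumed here, is confidence-based filtering: keep only the predictions with the lowest estimated uncertainty and recompute the RMSE on that subset. The Python sketch below illustrates the idea; the function names and cutoff fractions are hypothetical, not taken from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def rmse_at_cutoffs(y_true, y_pred, uncertainty, fractions=(1.0, 0.5, 0.25)):
    """RMSE over the most confident fraction of predictions, for each fraction."""
    # Sort indices from most confident (lowest uncertainty) to least confident.
    order = sorted(range(len(y_true)), key=lambda i: uncertainty[i])
    results = {}
    for frac in fractions:
        keep = order[:max(1, math.ceil(frac * len(order)))]
        results[frac] = rmse([y_true[i] for i in keep], [y_pred[i] for i in keep])
    return results

# Toy data in which larger uncertainty accompanies larger error.
scores = rmse_at_cutoffs(
    y_true=[0.0, 0.0, 0.0, 0.0],
    y_pred=[0.1, 0.2, 1.0, 2.0],
    uncertainty=[0.1, 0.2, 0.9, 1.5],
)
```

If the uncertainty estimates are informative, the RMSE should fall as the retained fraction shrinks; a flat or rising curve suggests the estimates do not rank errors reliably.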

Implications and Future Directions

The practical implication is that researchers should choose UQ methods cautiously for a given task, since performance is highly dataset-dependent. On the theoretical side, the results point to the need for new UQ methods that provide reliable estimates across diverse domains, potentially through techniques like model stacking or enhanced ensemble frameworks.

Looking forward, the paper suggests investigating emerging UQ strategies, extending scalable approximation methods to larger datasets, and exploring UQ metrics for classification tasks in drug discovery and related fields. Bridging the gap between theoretical robustness and practical applicability remains a key challenge in advancing dependable molecular property prediction models.