Low Data Drug Discovery with One-shot Learning (1611.03199v1)

Published 10 Nov 2016 in cs.LG and stat.ML

Abstract: Recent advances in machine learning have made significant contributions to drug discovery. Deep neural networks in particular have been demonstrated to provide significant boosts in predictive power when inferring the properties and activities of small-molecule compounds. However, the applicability of these techniques has been limited by the requirement for large amounts of training data. In this work, we demonstrate how one-shot learning can be used to significantly lower the amounts of data required to make meaningful predictions in drug discovery applications. We introduce a new architecture, the residual LSTM embedding, that, when combined with graph convolutional neural networks, significantly improves the ability to learn meaningful distance metrics over small-molecules. We open source all models introduced in this work as part of DeepChem, an open-source framework for deep-learning in drug discovery.

Citations (651)

Summary

The paper presents a one-shot learning framework utilizing a residual LSTM to drastically reduce data needs in drug discovery.
It combines graph convolutional networks with the residual LSTM to produce context-aware molecular embeddings, outperforming random-forest baselines on the Tox21 dataset.
Experiments show clear gains on Tox21 and SIDER but weaker results on MUV and in Tox21-to-SIDER transfer, motivating further research into cross-task generalization.

Low Data Drug Discovery with One-shot Learning: An Overview

The paper "Low Data Drug Discovery with One-shot Learning" applies one-shot learning techniques to computational drug discovery. Authored by Han Altae-Tran, Bharath Ramsundar, Aneesh S. Pappu, and Vijay Pande, the work addresses the challenge of making meaningful predictions when only a handful of labeled compounds are available.

Background and Motivation

Drug discovery, particularly the lead optimization stage, is often hindered by a paucity of data. Traditional machine learning approaches in this domain typically demand large datasets to predict molecular properties and biological activities, and collecting such data is resource-intensive and time-consuming. The paper applies one-shot learning to reduce the amount of data required while maintaining predictive accuracy.

Methodology

The paper introduces a new architecture, the residual LSTM embedding, designed for one-shot learning. Combined with graph convolutional neural networks, it improves the modeling of small-molecule properties by learning distance metrics over molecules in a low-data setting. The authors build context-aware embeddings through iterative refinement with dual residual LSTMs, so that each molecule's embedding depends on the other molecules it is compared against. This addresses the context independence of embeddings in previous models.
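To make the idea of predicting from a learned distance metric concrete, here is a minimal sketch of a similarity-weighted vote over a small support set. It is an illustration under stated assumptions, not the authors' implementation: the two-dimensional vectors are hypothetical stand-ins for learned molecular embeddings, the function names `cosine_similarity` and `one_shot_predict` are invented for this example, and the residual LSTM refinement step is omitted entirely.

```python
import numpy as np

def cosine_similarity(query, support):
    """Cosine similarity between one query vector and each support row."""
    q = query / np.linalg.norm(query)
    s = support / np.linalg.norm(support, axis=1, keepdims=True)
    return s @ q

def one_shot_predict(query, support, support_labels):
    """Softmax-weighted average of the binary support labels.

    Weights come from cosine similarity between the query embedding and
    each support embedding, so the result lies in [0, 1].
    """
    sims = cosine_similarity(query, support)
    weights = np.exp(sims - sims.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return float(weights @ support_labels)

# Hypothetical toy embeddings; in practice these would come from a
# trained graph convolutional network.
support = np.array([[1.0, 0.0], [0.0, 1.0]])
support_labels = np.array([1.0, 0.0])  # one active, one inactive compound

score_near_active = one_shot_predict(np.array([0.9, 0.1]), support, support_labels)
score_near_inactive = one_shot_predict(np.array([0.1, 0.9]), support, support_labels)
```

A query that lies closer to the active support compound receives a score above 0.5, and one closer to the inactive compound receives a score below 0.5. Because the embeddings here are fixed, each molecule is embedded independently of the support set, which is the context independence that the refinement step is meant to remove.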

Experimental Results

The paper evaluates the proposed methods on several datasets, including Tox21, SIDER, and MUV, employing a variety of training and testing splits to assess model performance.

Tox21 Dataset: One-shot learning methods, particularly the residual LSTM model, demonstrated superior accuracy over random-forest baselines, even when the training data was minimal. For instance, with just one positive and one negative example, the residual LSTM achieved an accuracy of 0.784 compared to the random forest's 0.542.
SIDER Dataset: Similar improvements were observed, with the residual LSTM attaining an accuracy of 0.623 in the most data-constrained setting.
MUV Dataset: The one-shot models showed no clear advantage here. Traditional methods performed better, highlighting the approach's limitations on collections of structurally diverse compounds.
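The "one positive and one negative example" setting above can be sketched as a support/query split for a single task. This is a hedged illustration rather than the paper's evaluation code: the label array is made up, and `sample_support` is a hypothetical helper name.

```python
import numpy as np

def sample_support(labels, n_pos, n_neg, rng):
    """Pick n_pos actives and n_neg inactives as the support set.

    Returns (support_idx, query_idx); the query set is every remaining
    compound for the task.
    """
    labels = np.asarray(labels)
    pos = np.flatnonzero(labels == 1)
    neg = np.flatnonzero(labels == 0)
    if len(pos) < n_pos or len(neg) < n_neg:
        raise ValueError("not enough examples of one class")
    support_idx = np.concatenate([
        rng.choice(pos, size=n_pos, replace=False),
        rng.choice(neg, size=n_neg, replace=False),
    ])
    query_idx = np.setdiff1d(np.arange(len(labels)), support_idx)
    return support_idx, query_idx

rng = np.random.default_rng(0)
labels = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])  # made-up labels for one task
support_idx, query_idx = sample_support(labels, n_pos=1, n_neg=1, rng=rng)
```

A model is then scored on the query compounds after seeing only the support compounds. Repeating the sampling with different seeds gives a sense of how much the result depends on which examples happened to land in the support set.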

Implications

The reported results indicate that one-shot learning, particularly with advancements in neural architectures like the residual LSTM, holds potential for transforming computational approaches to drug discovery. The paper's open-source approach, via the DeepChem library, encourages reproducibility and further exploration in the field.

However, limitations are acknowledged. The transfer learning experiments, where models trained on the Tox21 dataset were evaluated on the SIDER dataset, failed to achieve meaningful results, reflecting the need for further research into cross-task generalization capabilities.

Future Directions

Given the promising results and existing limitations, future research is likely to explore several avenues:

The development of improved architectures that can generalize across diverse molecular scaffolds.
Enhanced methods for incorporating external biological knowledge into model training.
Experimental validation to corroborate computational predictions and refine models.

This work adds to the growing overlap between machine learning and drug discovery, offering tools for pharmaceutical prediction problems where little labeled data is available.
