Deep Reinforcement Learning for De-Novo Drug Design (1711.10907v2)

Published 29 Nov 2017 in cs.AI, cs.LG, and stat.ML

Abstract: We propose a novel computational strategy for de novo design of molecules with desired properties termed ReLeaSE (Reinforcement Learning for Structural Evolution). Based on deep and reinforcement learning approaches, ReLeaSE integrates two deep neural networks - generative and predictive - that are trained separately but employed jointly to generate novel targeted chemical libraries. ReLeaSE employs simple representation of molecules by their SMILES strings only. Generative models are trained with stack-augmented memory network to produce chemically feasible SMILES strings, and predictive models are derived to forecast the desired properties of the de novo generated compounds. In the first phase of the method, generative and predictive models are trained separately with a supervised learning algorithm. In the second phase, both models are trained jointly with the reinforcement learning approach to bias the generation of new chemical structures towards those with the desired physical and/or biological properties. In the proof-of-concept study, we have employed the ReLeaSE method to design chemical libraries with a bias toward structural complexity or biased toward compounds with either maximal, minimal, or specific range of physical properties such as melting point or hydrophobicity, as well as to develop novel putative inhibitors of JAK2. The approach proposed herein can find a general use for generating targeted chemical libraries of novel compounds optimized for either a single desired property or multiple properties.

Summary

The paper presents the ReLeaSE framework, which integrates generative and predictive deep neural networks with reinforcement learning to optimize molecule generation.
Its generative model, a stack-augmented memory network, produces chemically valid SMILES strings about 95% of the time.
The predictive model forecasts key chemical properties with accuracy comparable to state-of-the-art QSAR models, and most generated molecules fall within acceptable synthetic accessibility limits.

Deep Reinforcement Learning for de-novo Drug Design

Introduction

The paper introduces ReLeaSE (Reinforcement Learning for Structural Evolution), a computational framework for de novo design of molecules with desired properties. The method combines deep learning with reinforcement learning (RL) to generate novel, targeted chemical libraries. ReLeaSE uses two deep neural networks, one generative and one predictive, that are first trained separately and then trained jointly with RL. The paper demonstrates the framework in proof-of-concept studies targeting structural complexity, physical properties such as melting point and hydrophobicity, and inhibition of JAK2.

Methodology

The ReLeaSE framework represents molecules solely by their SMILES strings. The generative model, a stack-augmented memory network, learns to produce chemically feasible SMILES strings, and the predictive model forecasts the desired properties of the generated compounds.
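To make the SMILES-only representation concrete, the following is a minimal sketch of how a SMILES string might be split into tokens and mapped to integer ids before being fed to a sequence model. The regular expression, the function names, and the `<` and `>` start and end markers are assumptions chosen for illustration; the summary does not specify the tokenization the authors used.

```python
import re

# Hypothetical tokenizer (an assumption, not the paper's documented scheme):
# bracket atoms such as [N+], the two-letter halogens Br and Cl, and
# two-digit ring closures such as %12 are each kept as a single token;
# everything else is split character by character.
_TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return _TOKEN_RE.findall(smiles)

def build_vocab(smiles_list, start="<", end=">"):
    """Map every token seen in the data, plus start/end markers, to an id."""
    tokens = {start, end}
    for smiles in smiles_list:
        tokens.update(tokenize(smiles))
    return {tok: i for i, tok in enumerate(sorted(tokens))}

def encode(smiles, vocab, start="<", end=">"):
    """Encode one SMILES string as a list of ids wrapped in start/end ids."""
    return [vocab[start]] + [vocab[t] for t in tokenize(smiles)] + [vocab[end]]
```

For example, `tokenize("CC(=O)Cl")` keeps `Cl` together as one token instead of splitting it into a carbon and a stray `l`.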

The workflow involves two main training phases:

Supervised Learning Phase: Separate training of generative and predictive models using supervised learning techniques.
Reinforcement Learning Phase: Joint training to optimize the generation of new chemical structures with desired physical, chemical, and bioactivity properties.

The reinforcement learning setup casts the generative network as an agent that produces SMILES strings and the predictive model as a critic that evaluates them. Each generated molecule receives a reward based on its predicted properties, and the generative model is trained to maximize the expected reward.
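The agent and critic loop can be illustrated with a toy policy-gradient example. The sketch below uses the REINFORCE estimator with a moving-average baseline, which is one common way to train such an agent; the summary does not state the exact update rule, so this is an assumption. The three-token alphabet, the context-free softmax policy, and `toy_reward` (a stand-in for the predictive model) are all hypothetical and far simpler than the networks in the paper.

```python
import math
import random

ALPHABET = ["C", "N", "O"]  # toy token set, not the paper's vocabulary

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_sequence(logits, length, rng):
    """Agent step: sample token ids independently from the softmax policy."""
    probs = softmax(logits)
    return rng.choices(range(len(ALPHABET)), weights=probs, k=length)

def toy_reward(token_ids):
    """Stand-in critic (assumption): fraction of carbon tokens in the sequence."""
    return sum(1 for t in token_ids if ALPHABET[t] == "C") / len(token_ids)

def reinforce(steps=500, length=8, lr=0.5, seed=0):
    """Train the toy policy with REINFORCE and return its final token probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(ALPHABET)
    baseline = 0.0
    for _ in range(steps):
        seq = sample_sequence(logits, length, rng)
        reward = toy_reward(seq)
        advantage = reward - baseline              # uses the baseline from earlier steps
        baseline = 0.9 * baseline + 0.1 * reward   # moving-average baseline
        probs = softmax(logits)
        # Gradient of sum_t log pi(a_t) w.r.t. the logits:
        # sum over tokens of (one_hot(a_t) - probs).
        grad = [-length * p for p in probs]
        for t in seq:
            grad[t] += 1.0
        for i in range(len(logits)):
            logits[i] += lr * advantage * grad[i] / length
    return softmax(logits)
```

Calling `reinforce()` returns the learned token probabilities; because the reward favors carbon tokens, the probability assigned to `"C"` should end up the largest of the three.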

Results

Validity and Novelty of Generated Molecules

In testing, the generative model produced chemically valid molecules 95% of the time. With a network lacking stack memory, the validity rate dropped to 86%. The analysis also showed that stack memory increased internal diversity and reduced the number of duplicate structures, which improves novelty, a key requirement in de novo drug design.
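Validity and duplicate rates of this kind can be computed with a few lines of code. The sketch below is illustrative only: `library_stats` and `balanced_parentheses` are hypothetical names, and the parenthesis check is a deliberately crude stand-in for a real validity test. In practice a cheminformatics toolkit would parse each string (for example, RDKit's `Chem.MolFromSmiles` returns `None` when parsing fails), and strings would be canonicalized before duplicates are counted.

```python
def balanced_parentheses(smiles):
    """Crude stand-in for a validity check (assumption): branches must balance."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def library_stats(smiles_list, is_valid):
    """Return the valid fraction of all strings and the unique fraction of valid ones."""
    valid = [s for s in smiles_list if is_valid(s)]
    unique = set(valid)
    n = len(smiles_list)
    return {
        "valid_fraction": len(valid) / n if n else 0.0,
        "unique_fraction": len(unique) / len(valid) if valid else 0.0,
    }
```

A call such as `library_stats(generated, balanced_parentheses)` reports both numbers for a list of generated strings; swapping in a toolkit-based `is_valid` changes the check without changing the bookkeeping.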

Structural Complexity and Synthetic Accessibility

The experiments generated libraries biased toward different target properties: melting point, hydrophobicity (LogP), and bioactivity against JAK2. The RL-optimized generative models shifted the mean values of these properties substantially away from the training-set distributions, supporting the method's efficacy.

The paper also examined the structural complexity and synthetic feasibility of the generated molecules. Synthetic Accessibility Score (SAS) analysis showed that most generated molecules fell within acceptable synthetic complexity limits (SAS < 6), which supports the method's practicality for drug discovery.

QSAR Modeling

The predictive component of ReLeaSE, analogous to traditional Quantitative Structure-Activity Relationship (QSAR) models, works on SMILES strings directly and skips the descriptor generation step. The models reached an external R² of 0.91 for LogP, and for melting temperature a root mean squared error (RMSE) comparable to that of state-of-the-art methods.
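For readers unfamiliar with the two metrics, the sketch below shows how R² and RMSE are conventionally computed from measured and predicted values. It assumes the standard coefficient-of-determination definition, R² = 1 - SS_res / SS_tot, evaluated on held-out ("external") compounds; the summary does not state which variant of R² the paper reports. The function names and the sample numbers are illustrative, not data from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up example values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(measured, predicted), 2))  # 0.98
```

An R² close to 1 means the predictions explain most of the variance in the measured values, while RMSE expresses the typical error in the units of the property itself.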

Visualizing Chemical Space

Dimensionality reduction using t-SNE was employed to visualize the chemical space populated by the generative model. For endpoints such as LogP, distinct clustering was observed, indicating effective property-based segregation in the generated libraries.

Discussion

ReLeaSE stands out by integrating generative and predictive modeling into a unified RL-driven workflow. Avoiding predefined chemical descriptors simplifies the QSAR modeling process. Other RL-based molecular design methods exist, but ReLeaSE's stack-augmented recurrent neural network differs from a conventional RNN in that it is better able to infer algorithmic patterns and long-range sequence dependencies, such as matching ring-closure digits and balanced parentheses, which are crucial for generating valid SMILES strings.
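The idea behind a stack-augmented memory can be sketched with a differentiable stack update. The code below follows the generic stack-RNN formulation, in which each cell of the new stack is a weighted blend of a push, a pop, and a no-op applied to the old stack. This is an assumption about the mechanism, not a detail confirmed by the summary, and `stack_update` and its parameters are hypothetical names. In a full model the three weights would come from a softmax over the hidden state and would sum to 1, and `push_value` would also be computed from the hidden state.

```python
def stack_update(stack, push_value, a_push, a_pop, a_noop):
    """Blend push, pop, and no-op on a fixed-depth stack (index 0 is the top).

    a_push, a_pop, and a_noop are soft action weights expected to sum to 1.
    """
    depth = len(stack)
    new_stack = [0.0] * depth
    for i in range(depth):
        # Push moves every value one cell down and writes push_value on top.
        from_push = push_value if i == 0 else stack[i - 1]
        # Pop moves every value one cell up; the bottom cell is refilled with 0.
        from_pop = stack[i + 1] if i + 1 < depth else 0.0
        new_stack[i] = a_push * from_push + a_pop * from_pop + a_noop * stack[i]
    return new_stack
```

With weights `(1, 0, 0)` the function behaves like an ordinary push and with `(0, 1, 0)` like an ordinary pop; intermediate weights keep the operation differentiable, so it can be trained by gradient descent along with the rest of the network.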

The method's ability to optimize individual properties while keeping most generated molecules synthetically accessible supports its potential for real-world use in early drug discovery. Future work will explore multi-objective optimization, balancing drug-like properties with efficacy, selectivity, and ADMET considerations.

Implications and Future Work

The ReLeaSE method advances computational drug design with an integrated approach for generating structurally diverse, property-optimized molecules. It addresses several of the principal challenges in the area: novelty of the generated structures, synthetic feasibility, and property prediction directly from SMILES strings. Extending ReLeaSE to multi-objective optimization would serve more complex requirements in drug development and could benefit the initial hit discovery and lead optimization stages.

In conclusion, ReLeaSE is a promising framework for computational molecular design that uses deep reinforcement learning to generate targeted chemical libraries with desired properties.
