Multi-Objective De Novo Drug Design with Conditional Graph Generative Model (1801.07299v3)

Published 18 Jan 2018 in q-bio.QM and cs.LG

Abstract: Recently, deep generative models have emerged as a promising way of performing de novo molecule design. However, previous research has focused mainly on generating SMILES strings instead of molecular graphs. Although graph generative models are available, they are often too general and computationally expensive, which restricts their application to small molecules. In this work, a new de novo molecular design framework is proposed based on a type of sequential graph generator that does not use atom-level recurrent units. Compared with previous graph generative models, the proposed method is better tuned for molecule generation and has been scaled up to cover significantly larger molecules in the ChEMBL database. It is shown that the graph-based model outperforms SMILES-based models in a variety of metrics, especially in the rate of valid outputs. For drug design tasks, a conditional graph generative model is employed. This method offers higher flexibility than the previous fine-tuning-based approach and is suitable for generation based on multiple objectives. The approach is applied to several drug design problems, including the generation of compounds containing a given scaffold, the generation of compounds with specific drug-likeness and synthetic accessibility requirements, and the generation of dual inhibitors against JNK3 and GSK3$\beta$. Results show high enrichment rates for outputs satisfying the given requirements.


Multi-Objective De Novo Drug Design with Conditional Graph Generative Model

The paper "Multi-Objective De Novo Drug Design with Conditional Graph Generative Model" presents an approach to automated de novo design of drug molecules. Instead of generating SMILES strings, as most earlier work did, it generates molecular graphs directly. The authors argue that this representation suits molecule generation better and show that it scales to the larger molecules found in the ChEMBL database.

The work, by Yibo Li, Liangren Zhang, and Zhenming Liu, introduces a graph generative model tailored specifically to molecular structures. SMILES-based methods can produce strings that fail for purely syntactic reasons unrelated to the underlying chemistry. This paper instead uses a graph representation in which atoms are nodes and bonds are edges, and builds each molecule through a sequence of graph-editing steps. The authors report that this decoding process substantially raises the rate of valid outputs and makes the model more useful for drug design.
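The graph view described above can be illustrated with a small sketch. It is not the authors' implementation: the class name `MolGraph`, the `append_atom` and `connect` methods, and the simplified valence table are all assumptions made for illustration (no charges, aromaticity, or hydrogens are modeled). It only shows how building a molecule step by step on a graph lets each step be checked against a chemical rule before it is accepted.

```python
from dataclasses import dataclass, field

# Simplified maximum valences (assumption: neutral atoms, no hypervalent states).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


@dataclass
class MolGraph:
    """Toy molecular graph: atoms are nodes, bonds are edges with an order."""
    atoms: list = field(default_factory=list)  # element symbols, indexed by node id
    bonds: dict = field(default_factory=dict)  # (i, j) with i < j -> bond order

    def used_valence(self, i):
        """Sum of bond orders on atom i."""
        return sum(order for (a, b), order in self.bonds.items() if i in (a, b))

    def can_bond(self, i, j, order):
        """True if a new bond i-j of this order breaks no valence limit."""
        if i == j or (min(i, j), max(i, j)) in self.bonds:
            return False
        return all(
            self.used_valence(k) + order <= MAX_VALENCE[self.atoms[k]]
            for k in (i, j)
        )

    def append_atom(self, element, attach_to=None, order=1):
        """Add an atom, optionally bonded to an existing one.

        Returns the new atom index, or None if the bond would be invalid.
        """
        new = len(self.atoms)
        self.atoms.append(element)
        if attach_to is not None:
            if not self.can_bond(attach_to, new, order):
                self.atoms.pop()  # roll back the rejected step
                return None
            self.bonds[(attach_to, new)] = order
        return new

    def connect(self, i, j, order=1):
        """Bond two existing atoms (for example a ring closure)."""
        if not self.can_bond(i, j, order):
            return False
        self.bonds[(min(i, j), max(i, j))] = order
        return True


g = MolGraph()
c0 = g.append_atom("C")
c1 = g.append_atom("C", attach_to=c0)
o = g.append_atom("O", attach_to=c1)
closed = g.connect(o, c0)                   # ring closure: three-membered C-C-O ring
rejected = g.append_atom("F", attach_to=o)  # refused: oxygen already has two bonds
```

In this toy version an invalid step is simply refused, so the graph never enters a state that violates the valence table. A learned generator would instead choose among such steps with a neural network.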

Key Contributions

Graph-Based Molecular Design: The authors propose a graph-based generative model built specifically for molecules. Because it operates on atoms and bonds directly, it avoids the syntax-related failures of the SMILES format and can respect chemical rules as part of the generation process.
Conditional Graph Generative Model: The paper introduces a conditional model that is trained to produce molecules meeting predefined criteria, such as containing a given scaffold or reaching particular drug-likeness or synthetic accessibility scores.
Optimization and Scalability: The graph model avoids atom-level recurrent units, using either a Markov formulation (MolMP) or a molecule-level recurrent unit (MolRNN). This choice improves scalability and lets the model handle larger molecules efficiently.
Multi-Objective Target Design: The authors apply the model to multi-objective drug design tasks, such as designing dual inhibitors of JNK3 and GSK3β. They report high enrichment rates for outputs that satisfy the requirements, which suggests the method is useful for generating focused molecular libraries.
Evaluation Metrics and Results: The paper reports that the graph-based models produce valid molecules at a higher rate than SMILES-based approaches. For example, the MolRNN model achieved a 97% validity rate. The property distributions of the generated molecules also match those of the test set more closely, as shown by lower KL and JS divergence values.
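The KL and JS divergences mentioned in the last item can be computed as in the following sketch. It makes assumptions that this summary does not confirm: the property distributions are estimated with fixed-width histograms, a small `eps` is added for numerical stability, and the helper names are hypothetical. The paper's own estimation procedure may differ.

```python
import math


def histogram(values, bins, lo, hi):
    """Normalized fixed-width histogram; out-of-range values go to the edge bins.

    Assumes `values` is non-empty.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) in nats; eps guards against division by zero."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

A typical use would be to histogram a property such as molecular weight for the generated set and for the test set over the same range, then compare the two histograms. Lower values mean the generated distribution is closer to the reference.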

Implications and Future Prospects

The conditional graph generative model described in this paper is a meaningful advance for computational drug design. Because requirements are supplied as conditions at generation time, the model can handle several objectives at once without being fine-tuned separately for each task. This lays a foundation for future work on design tasks that demand high specificity, such as allosteric modulators or selective kinase inhibitors.
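To make the idea of multi-objective requirements and enrichment concrete, the sketch below scores two sets of candidates against a pair of thresholds. Everything in it is an illustrative assumption: the `Candidate` class, the threshold values, the toy numbers, and the definition of enrichment as the ratio of success rates between conditional and unconditional samples. None of it is taken from the paper's code or results.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    qed: float  # drug-likeness score in [0, 1], higher is better
    sa: float   # synthetic accessibility score, lower means easier to make


def meets_objectives(c, min_qed, max_sa):
    """A candidate succeeds only if it satisfies every objective at once."""
    return c.qed >= min_qed and c.sa <= max_sa


def success_rate(candidates, min_qed, max_sa):
    """Fraction of candidates meeting all objectives (0.0 for an empty list)."""
    if not candidates:
        return 0.0
    hits = sum(meets_objectives(c, min_qed, max_sa) for c in candidates)
    return hits / len(candidates)


def enrichment(conditional, baseline, min_qed, max_sa):
    """Success rate of conditional samples relative to unconditional ones."""
    base = success_rate(baseline, min_qed, max_sa)
    if base == 0:
        return math.inf
    return success_rate(conditional, min_qed, max_sa) / base


# Made-up toy values, not results from the paper.
conditional = [Candidate(0.9, 2.0), Candidate(0.85, 2.5),
               Candidate(0.5, 2.0), Candidate(0.9, 2.2)]
baseline = [Candidate(0.9, 2.0), Candidate(0.3, 5.0),
            Candidate(0.6, 4.0), Candidate(0.4, 3.5)]

ratio = enrichment(conditional, baseline, min_qed=0.8, max_sa=3.0)
```

With these toy inputs, three of four conditional samples and one of four baseline samples pass both thresholds, so the ratio is 3.0.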

The paper also illustrates the value of adapting machine learning methods to a specific domain such as pharmaceutical development, where they can help identify and optimize drug candidates. Possible directions for future research include incorporating stereochemistry into the model and extending it to biopharmaceuticals and complex natural products.

Overall, the paper is a significant step forward for molecular informatics: it combines deep learning with the practical requirements of drug discovery, and it points toward possible wider applications in personalized medicine and automated drug synthesis.


Authors (3)

Yibo Li (17 papers)
Liangren Zhang (6 papers)
Zhenming Liu (30 papers)

Citations (320)
