One-step Structure Prediction and Screening for Protein-Ligand Complexes using Multi-Task Geometric Deep Learning (2408.11356v1)

Published 21 Aug 2024 in cs.AI, cs.LG, and q-bio.BM

Abstract: Understanding the structure of the protein-ligand complex is crucial to drug development. Existing virtual structure measurement and screening methods are dominated by docking and its derived methods combined with deep learning. However, the sampling and scoring methodology have largely restricted the accuracy and efficiency. Here, we show that these two fundamental tasks can be accurately tackled with a single model, namely LigPose, based on multi-task geometric deep learning. By representing the ligand and the protein pair as a graph, LigPose directly optimizes the three-dimensional structure of the complex, with the learning of binding strength and atomic interactions as auxiliary tasks, enabling its one-step prediction ability without docking tools. Extensive experiments show LigPose achieved state-of-the-art performance on major tasks in drug research. Its considerable improvements indicate a promising paradigm of AI-based pipeline for drug development.

Summary

The paper introduces LigPose, a novel unified deep learning model that predicts protein-ligand structures and binding profiles in one step.
It uses a graph transformer and multi-task learning to update features and coordinates, achieving inference up to 1851 times faster than traditional docking tools.
Benchmarks on PDBbind, CASF-2016, and SARS-CoV-2 datasets show significant improvements in structure prediction, cross-docking, and screening power.

Overview of One-step Structure Prediction and Screening for Protein-Ligand Complexes using Multi-Task Geometric Deep Learning

The paper "One-step Structure Prediction and Screening for Protein-Ligand Complexes using Multi-Task Geometric Deep Learning" presents LigPose, a unified deep learning framework for both structure prediction and screening of protein-ligand complexes. Its significance lies in making drug development more streamlined and precise: accurate three-dimensional structures of protein-ligand complexes are critical for understanding drug efficacy and bioactivity.

Methodology and Architecture

LigPose optimizes the 3D structure of the protein-ligand complex by representing the ligand and protein as a graph, bypassing traditional docking tools. This method integrates multi-task learning to simultaneously predict structure, binding affinity, and screening probability. LigPose employs three main innovations:

Graph Representation and Optimization:
- A complete undirected graph is constructed in which nodes represent atoms and an edge joins every pair of nodes.
- The method uses a sampling and recycling strategy to manage the computational burden by processing only a subset of atoms at each step.
Feature and Coordinate Update Blocks:
- Feature Update Block: Utilizes a graph transformer to aggregate and update node and edge features based on multi-head attention mechanisms.
- Coordinate Update Block: Updates atom coordinates in 3D space while preserving SE(3)-equivariance, since the update relies on inter-atomic distances during the forward pass.
Multi-task Learning Framework:
- Predicts the 3D coordinates, binding affinity, and binding probability simultaneously.
- Self-supervised learning on large-scale unlabeled data enhances the model’s generalizability.
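To illustrate why a coordinate update driven by inter-atomic distances behaves consistently under rotations and translations (the SE(3) property mentioned above), here is a minimal sketch in plain Python. It is not LigPose's actual update block: the function names, the `toy_weight` function (a hypothetical stand-in for a learned function of node and edge features), and the averaging over neighbours are all assumptions made for illustration.

```python
import math
import random


def coordinate_update(coords, weight_fn):
    """One distance-driven coordinate update over a complete graph.

    Each atom moves along the vectors to every other atom, scaled by a
    scalar that depends only on the inter-atomic distance.
    """
    n = len(coords)
    updated = []
    for i in range(n):
        shift = [0.0, 0.0, 0.0]
        for j in range(n):
            if i == j:
                continue  # complete graph, but no self-loops
            diff = [coords[i][k] - coords[j][k] for k in range(3)]
            dist = math.sqrt(sum(d * d for d in diff))
            w = weight_fn(dist)  # invariant: distances ignore rigid motions
            for k in range(3):
                shift[k] += w * diff[k] / (n - 1)
        updated.append([coords[i][k] + shift[k] for k in range(3)])
    return updated


def toy_weight(dist):
    # Hypothetical stand-in for a learned function of node and edge features.
    return math.exp(-dist) - 0.1


def rotate_z_then_translate(coords, angle, shift):
    """Apply one rigid motion: rotation about the z axis, then a translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]]
            for x, y, z in coords]


# Check: updating a moved complex equals moving the updated complex.
random.seed(0)
atoms = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(6)]
motion = (0.7, [1.0, -2.0, 0.5])
moved_then_updated = coordinate_update(
    rotate_z_then_translate(atoms, *motion), toy_weight)
updated_then_moved = rotate_z_then_translate(
    coordinate_update(atoms, toy_weight), *motion)
max_gap = max(abs(a - b)
              for p, q in zip(moved_then_updated, updated_then_moved)
              for a, b in zip(p, q))
```

Because the scalar weights depend only on distances, which rigid motions leave unchanged, and the displacement vectors rotate with the input, `max_gap` stays at floating-point noise level. The check uses a single rotation about the z axis for brevity; the same argument holds for any rotation.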

Performance Evaluation

The performance of LigPose was compared with 12 traditional docking tools and several recent deep learning methods across various datasets, including the PDBbind refined and core sets, PDBbind-CrossDocked-Core, and CASF-2016. Key performance metrics included:

Structure Prediction: Achieved a success rate of 74.1% on the PDBbind refined set, outperforming traditional docking tools by a significant margin.
Cross-Docking: Improved the success rate by 20.1% over the best-performing competing method on the more practical and challenging cross-docking task.
Screening Power: Exhibited a higher success rate and enhancement factors for both forward and reverse screening tasks on the CASF-2016 benchmark.
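For readers unfamiliar with how a structure-prediction success rate is usually computed, the sketch below shows one common convention: a pose counts as a success when its root-mean-square deviation (RMSD) from the reference ligand pose is within a threshold. The 2.0 Å threshold and the function names are assumptions for illustration; this summary does not state the exact criterion used in the paper.

```python
import math


def rmsd(pred, ref):
    """RMSD between two poses given as lists of [x, y, z] in the same atom order."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum((p[k] - r[k]) ** 2 for p, r in zip(pred, ref) for k in range(3))
    return math.sqrt(squared / len(pred))


def success_rate(pred_poses, ref_poses, threshold=2.0):
    """Fraction of complexes whose predicted pose is within `threshold` angstroms.

    The default threshold is an assumed convention, not a value from the source.
    """
    hits = sum(rmsd(p, r) <= threshold for p, r in zip(pred_poses, ref_poses))
    return hits / len(pred_poses)


# Two toy "ligands" with two atoms each: one close pose, one far pose.
ref = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
close_pose = [[0.5, 0.0, 0.0], [1.5, 0.0, 0.0]]  # every atom shifted by 0.5
far_pose = [[3.0, 0.0, 0.0], [4.0, 0.0, 0.0]]    # every atom shifted by 3.0
rate = success_rate([close_pose, far_pose], [ref, ref])
```

Here `rate` is 0.5, since only the close pose falls under the threshold. The sketch compares poses directly in the protein's coordinate frame and ignores ligand symmetry, which benchmark tooling typically handles.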

Efficiency and Robustness

LigPose not only enhances the accuracy but also substantially improves the efficiency of structure prediction and virtual screening. For instance:

Inference Speed: LigPose is shown to perform up to 1851 times faster than traditional docking tools, which is crucial for high-throughput drug screening.
Handling Ligand Flexibility: Exhibits robust performance on ligands with various flexibilities, maintaining high accuracy even for ligands with many rotatable bonds.

Applicability in Real-world Scenarios

The applicability of LigPose was further validated on real-world datasets, specifically focusing on SARS-CoV-2 main protease (Mpro) structures and screening inhibitors:

Structure Prediction for SARS-CoV-2 Mpro: Achieved a significant improvement in success rate over traditional docking tools on a benchmark dataset.
Inhibitor Screening: Demonstrated notable accuracy improvements in identifying potential SARS-CoV-2 Mpro inhibitors, underscoring its robustness in real-world drug discovery contexts.

Interpretability

One highlight of LigPose is its interpretability, particularly its ability to reconstruct non-covalent interactions, which are crucial in drug design. Because it learns these interactions implicitly, LigPose offers insight into atomic correlations without relying on explicit physical or chemical priors.

Implications and Future Prospects

The implications of this research are profound for both theoretical and practical aspects of drug development:

Theoretically, LigPose suggests a paradigm shift in the way protein-ligand complex structures are predicted, moving away from traditional docking-based methods to an integrated deep learning approach.
Practically, its high accuracy and efficiency could significantly accelerate drug discovery processes, making it possible to screen vast libraries of compounds effectively and efficiently.

Future developments could focus on integrating LigPose with other AI-driven protein structure prediction tools like AlphaFold for a comprehensive end-to-end drug discovery pipeline. Additionally, exploring the model’s application in de novo drug design opens another promising avenue for research.

In sum, LigPose presents a significant advancement in the computational drug development field, offering a robust, efficient, and highly accurate method for predicting and screening protein-ligand complexes.
