- The paper presents HAC-Net, a novel hybrid model that fuses 3D CNNs with channel-wise and node-wise attention to predict protein-ligand binding affinity.
- It uses squeeze-and-excitation blocks in its CNN and dual attention-based GCNs, achieving an RMSE of 1.205 on the PDBbind v.2016 core set.
- The model's robust performance across diverse datasets and open-source implementation highlight its potential to advance drug discovery and protein engineering.
Overview of HAC-Net: Hybrid Attention-Based Convolutional Neural Network for Protein-Ligand Binding Affinity Prediction
The paper "HAC-Net: A Hybrid Attention-Based Convolutional Neural Network for Highly Accurate Protein-Ligand Binding Affinity Prediction" presents a novel deep learning architecture aimed at improving the prediction of protein-ligand binding affinity, a critical factor in drug discovery and protein engineering. The paper leverages the advancements in deep learning, particularly from image detection via convolutional neural networks (CNNs) and graph theory via graph convolutional networks (GCNs), to enhance predictive performance in this domain.
Model Architecture and Innovations
The proposed architecture, HAC-Net, combines a 3-dimensional CNN with channel-wise attention and two GCNs that use attention-based aggregation of node features. This hybrid approach is designed to exploit the complementary strengths of the two frameworks. The CNN component represents protein-ligand complexes as voxel grids and learns spatial features through 3D convolutions, with attention recalibrating the resulting feature maps. The GCNs represent the same complexes as graphs, where attention guides the aggregation of node features and emphasizes the most relevant atomic interactions.
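To make the fusion concrete, the following is a minimal PyTorch sketch of a hybrid model with a 3D-CNN branch over a voxel grid and a simple message-passing branch over a graph, with the two branch predictions averaged. All layer sizes, feature dimensions, and the averaging scheme are illustrative assumptions, not HAC-Net's published configuration.

```python
import torch
import torch.nn as nn

class HybridAffinityModel(nn.Module):
    """Sketch of a CNN + GCN hybrid: a 3D CNN branch reads a voxel grid of the
    complex, a message-passing branch reads a graph of the same complex, and
    the two per-branch affinity estimates are combined (here, averaged)."""

    def __init__(self, in_channels=19, node_feat_dim=20, hidden_dim=64):
        super().__init__()
        # 3D CNN branch over the voxelized protein-ligand complex
        self.cnn = nn.Sequential(
            nn.Conv3d(in_channels, hidden_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # global pooling to a feature vector
            nn.Flatten(),
            nn.Linear(hidden_dim, 1),
        )
        # Simple graph branch: one linear message-passing step + readout
        self.node_embed = nn.Linear(node_feat_dim, hidden_dim)
        self.msg = nn.Linear(hidden_dim, hidden_dim)
        self.readout = nn.Linear(hidden_dim, 1)

    def forward(self, voxels, node_feats, adj):
        # voxels: (B, C, D, H, W); node_feats: (N, F); adj: (N, N) adjacency
        cnn_pred = self.cnn(voxels)                           # (B, 1)
        h = torch.relu(self.node_embed(node_feats))           # (N, hidden)
        h = torch.relu(self.msg(adj @ h))                     # one message pass
        gcn_pred = self.readout(h.mean(dim=0, keepdim=True))  # graph readout
        # Average the two branch predictions (illustrative fusion)
        return 0.5 * (cnn_pred + gcn_pred)

# Example usage with random tensors (a single complex)
model = HybridAffinityModel()
voxels = torch.randn(1, 19, 24, 24, 24)
node_feats = torch.randn(30, 20)
adj = torch.eye(30)
print(model(voxels, node_feats, adj).shape)  # torch.Size([1, 1])
```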
HAC-Net introduces squeeze-and-excitation (SE) blocks into the CNN to implement channel-wise attention, recalibrating channel weights according to their importance for the prediction. The GCN component, inspired by gated graph neural networks (GG-NNs), applies node-wise attention to refine feature aggregation across message-passing iterations, culminating in a final prediction of binding affinity.
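The two attention mechanisms can be sketched separately. The SE block below follows the standard squeeze-and-excitation pattern (global average pooling, bottleneck MLP, sigmoid rescaling), and the attention readout weights each node's contribution to a graph-level embedding in the spirit of a gated, GG-NN-style readout. The reduction ratio, gating network, and layer sizes are assumptions for illustration, not HAC-Net's exact hyperparameters.

```python
import torch
import torch.nn as nn

class SEBlock3D(nn.Module):
    """Channel-wise attention: squeeze each channel of a 3D feature map to a
    scalar, pass the channel descriptors through a bottleneck MLP, and rescale
    the channels by the resulting weights."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # x: (B, C, D, H, W)
        b, c = x.shape[:2]
        weights = self.fc(x.mean(dim=(2, 3, 4))).view(b, c, 1, 1, 1)
        return x * weights  # channel-wise recalibration

class AttentionReadout(nn.Module):
    """Node-wise attention: a learned gate scores each node, and the graph
    embedding is the attention-weighted sum of node features."""

    def __init__(self, node_dim):
        super().__init__()
        self.gate = nn.Linear(node_dim, 1)

    def forward(self, node_feats):
        # node_feats: (N, F)
        scores = torch.sigmoid(self.gate(node_feats))  # (N, 1) per-node weights
        return (scores * node_feats).sum(dim=0)        # (F,) graph-level vector
```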
Evaluation and Results
The model's efficacy is evaluated on the PDBbind v.2016 core set, a standard benchmark for protein-ligand binding affinity prediction. HAC-Net achieves a root-mean-square error (RMSE) of 1.205, outperforming other state-of-the-art models in the literature, and strong results also hold under additional evaluation metrics, including Pearson correlation and Spearman rank correlation.
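For reference, these metrics can be computed as in the sketch below; the function name and the toy affinity values are placeholders, not outputs from the paper.

```python
import numpy as np
from scipy import stats

def affinity_metrics(y_true, y_pred):
    """Compute RMSE, Pearson correlation, and Spearman rank correlation,
    the three metrics typically reported for core-set benchmarks."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    pearson_r, _ = stats.pearsonr(y_true, y_pred)
    spearman_rho, _ = stats.spearmanr(y_true, y_pred)
    return {"rmse": rmse, "pearson": pearson_r, "spearman": spearman_rho}

# Example with placeholder binding affinities (pKd-like values)
print(affinity_metrics([6.2, 7.5, 4.8, 9.1], [6.0, 7.9, 5.2, 8.6]))
```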
Moreover, the paper examines HAC-Net's generalizability through training and testing scenarios that emphasize structural, sequence, and ligand-fingerprint dissimilarity between datasets. Robustness is further assessed through 10-fold cross-validation with splits based on ligand SMILES dissimilarity and by testing on lower-quality structures, supporting the model's applicability beyond high-quality crystal structures.
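As an illustration of how ligand-fingerprint dissimilarity between training and test sets might be quantified, the RDKit sketch below computes the maximum Tanimoto similarity of a test ligand to a training set using Morgan fingerprints. The fingerprint radius, bit length, and example SMILES are assumptions, and this is not the paper's exact split-construction protocol.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def max_tanimoto_to_train(test_smiles, train_smiles_list):
    """Return the highest Tanimoto similarity between a test ligand and any
    training ligand, using Morgan (ECFP-like) bit-vector fingerprints.
    Low values indicate the test ligand is dissimilar to the training set."""
    fp = AllChem.GetMorganFingerprintAsBitVect(
        Chem.MolFromSmiles(test_smiles), 2, nBits=2048)
    train_fps = [
        AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, nBits=2048)
        for s in train_smiles_list
    ]
    return max(DataStructs.TanimotoSimilarity(fp, tfp) for tfp in train_fps)

# Example: aspirin vs. two unrelated training ligands
print(max_tanimoto_to_train("CC(=O)Oc1ccccc1C(=O)O", ["c1ccccc1", "CCO"]))
```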
Implications and Future Directions
The architectural innovations of HAC-Net, particularly the incorporation of attention mechanisms into CNNs and GCNs, provide substantial contributions to deep learning applications in drug discovery. By reliably predicting binding affinities across diverse datasets, HAC-Net demonstrates potential for significant impact on computer-aided drug design and protein engineering. Its utility is further enhanced by the open availability of the model's software, facilitating reproducibility and further exploration within the scientific community.
Future work could explore integration with generative models for de novo drug design or extending HAC-Net to predict additional molecular properties, further broadening its applicability in biochemistry and pharmacology. Additionally, optimizing the model for dynamic protein-ligand interactions could unlock predictive capabilities in more complex biological contexts, aligning with the ongoing evolution of deep learning in life sciences.