CoGANPPIS: A Coevolution-enhanced Global Attention Neural Network for Protein-Protein Interaction Site Prediction (2303.06945v4)
Abstract: Protein-protein interactions are of great importance in biochemical processes. Accurate prediction of protein-protein interaction sites (PPIs) is crucial for our understanding of biological mechanisms. Although numerous approaches have been developed recently and have achieved promising results, two limitations remain: (1) most existing models have explored a number of useful input features but fail to take coevolutionary features into account, which could provide clues about inter-residue relationships; (2) attention-based models allocate attention weights only to neighboring residues rather than globally, which may limit prediction performance, since residues far from the target residue might also matter. We propose CoGANPPIS, a coevolution-enhanced global attention neural network that serves as a sequence-based deep learning model for PPIs prediction. Specifically, CoGANPPIS uses three layers in parallel for feature extraction: (1) a local-level representation aggregation layer, which aggregates the features of neighboring residues into a local feature representation; (2) a global-level representation learning layer, which employs a novel coevolution-enhanced global attention mechanism to allocate attention weights to all residues on the same protein sequence; (3) a coevolutionary information learning layer, which applies a CNN and pooling to coevolutionary information to obtain a coevolutionary profile representation. The three outputs are then concatenated and passed into several fully connected layers for the final prediction. Extensive experiments on two benchmark datasets demonstrate that the proposed model achieves state-of-the-art performance.
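The three parallel branches described in the abstract can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation: the window size, the additive form of the coevolution term in the attention score, the fixed convolution kernel, the pooling choices, and all function names (`local_branch`, `global_branch`, `coevo_branch`, `residue_representation`) are assumptions made only for illustration, and the final fully connected layers are omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_branch(features, i, half_window=1):
    """Local level: mean of feature vectors in a window around residue i
    (mean aggregation and window size are assumptions)."""
    lo, hi = max(0, i - half_window), min(len(features), i + half_window + 1)
    window = features[lo:hi]
    dim = len(features[0])
    return [sum(v[d] for v in window) / len(window) for d in range(dim)]


def global_branch(features, coevo, i):
    """Global level: attention weights over ALL residues of the sequence.
    The score is a dot product plus a coevolution term (assumed form)."""
    target = features[i]
    scores = [
        sum(a * b for a, b in zip(target, features[j])) + coevo[i][j]
        for j in range(len(features))
    ]
    weights = softmax(scores)
    dim = len(target)
    context = [sum(w * v[d] for w, v in zip(weights, features)) for d in range(dim)]
    return context, weights


def coevo_branch(coevo, i, kernel=(0.25, 0.5, 0.25)):
    """Coevolution level: 1-D convolution over row i of the coevolution
    matrix, followed by max and mean pooling (fixed kernel is an assumption)."""
    row = coevo[i]
    k = len(kernel)
    conv = [
        sum(kernel[t] * row[s + t] for t in range(k))
        for s in range(len(row) - k + 1)
    ]
    return [max(conv), sum(conv) / len(conv)]


def residue_representation(features, coevo, i):
    """Concatenate the three branch outputs for residue i."""
    context, _ = global_branch(features, coevo, i)
    return local_branch(features, i) + context + coevo_branch(coevo, i)


# Toy input: 6 residues with 4 features each and a symmetric coevolution matrix.
random.seed(0)
seq_len, feat_dim = 6, 4
features = [[random.random() for _ in range(feat_dim)] for _ in range(seq_len)]
coevo = [[0.0] * seq_len for _ in range(seq_len)]
for a in range(seq_len):
    for b in range(a + 1, seq_len):
        coevo[a][b] = coevo[b][a] = random.random()

rep = residue_representation(features, coevo, i=2)
_, attn = global_branch(features, coevo, i=2)
print(len(rep), round(sum(attn), 6))
```

In this sketch the concatenated vector has `2 * feat_dim + 2` entries per residue; in a trained model it would be fed to fully connected layers that output an interaction-site probability, and the fixed kernel and dot-product score would be replaced by learned parameters.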