
PepGB: Facilitating peptide drug discovery via graph neural networks (2401.14665v1)

Published 26 Jan 2024 in q-bio.BM and cs.AI

Abstract: Peptides offer great biomedical potential and serve as promising drug candidates. Currently, the majority of approved peptide drugs are derived directly from well-explored natural human peptides. Advanced deep learning techniques are needed to identify novel peptide drugs in the vast, unexplored biochemical space. Although various in silico methods have been developed to accelerate early peptide drug discovery, existing models overfit and generalize poorly because of the limited size, imbalanced distribution, and inconsistent quality of experimental data. In this study, we propose PepGB, a deep learning framework that facilitates early peptide drug discovery by predicting peptide-protein interactions (PepPIs). Built on graph neural networks, PepGB incorporates a fine-grained perturbation module and a dual-view objective, together with a peptide representation pre-trained by contrastive learning, to predict PepPIs. Through rigorous evaluations, we demonstrate that PepGB greatly outperforms baselines and can accurately identify PepPIs for novel targets and peptide hits, thereby contributing to the target identification and hit discovery processes. Next, we derive an extended version, diPepGB, to tackle the bottleneck of modeling the highly imbalanced data prevalent in lead generation and optimization. By using directed edges to represent the relative binding strength between two peptide nodes, diPepGB achieves superior performance in real-world assays. In summary, our proposed frameworks can serve as potent tools to facilitate early peptide drug discovery.
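
The abstract describes diPepGB as using directed edges to represent the relative binding strength between two peptide nodes. As a rough illustration of that idea only, and not the authors' implementation, the sketch below turns a handful of per-peptide binding scores into directed edges pointing from the stronger binder to the weaker one. The function name `build_directed_edges`, the `min_gap` threshold, the edge direction convention, the "higher score means stronger binding" convention, and the toy scores are all assumptions introduced here.

```python
from itertools import combinations


def build_directed_edges(affinities, min_gap=0.0):
    """Return directed edges (stronger, weaker) for peptide pairs.

    affinities: dict mapping a peptide id to a binding score, where a
        higher score is assumed to mean stronger binding (hypothetical
        convention, not taken from the paper).
    min_gap: pairs whose scores differ by no more than this value are
        skipped, since their relative order is treated as unreliable.
    """
    edges = []
    # Visit each unordered pair once, in a deterministic order.
    for a, b in combinations(sorted(affinities), 2):
        gap = affinities[a] - affinities[b]
        if abs(gap) <= min_gap:
            continue  # difference too small to assign a direction
        edges.append((a, b) if gap > 0 else (b, a))
    return edges


# Toy, made-up scores for three peptides against a single target.
toy_scores = {"pep_A": 8.2, "pep_B": 6.5, "pep_C": 8.1}
edges = build_directed_edges(toy_scores, min_gap=0.5)
print(edges)
```

Framing the data as pairwise comparisons means each edge label depends only on which of two peptides binds more strongly, which is one plausible reason such a formulation could help when absolute labels are highly imbalanced.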
