
Multi-level Interaction Modeling for Protein Mutational Effect Prediction (2405.17802v1)

Published 28 May 2024 in cs.LG, cs.AI, and q-bio.BM

Abstract: Protein-protein interactions are central mediators in many biological processes. Accurately predicting the effects of mutations on interactions is crucial for guiding the modulation of these interactions, thereby playing a significant role in therapeutic development and drug discovery. Mutations generally affect interactions hierarchically across three levels: mutated residues exhibit different sidechain conformations, which lead to changes in the backbone conformation, eventually affecting the binding affinity between proteins. However, existing methods typically focus only on sidechain-level interaction modeling, resulting in suboptimal predictions. In this work, we propose a self-supervised multi-level pre-training framework, ProMIM, to fully capture all three levels of interactions with well-designed pretraining objectives. Experiments show ProMIM outperforms all the baselines on the standard benchmark, especially on mutations where significant changes in backbone conformations may occur. In addition, leading results from zero-shot evaluations for SARS-CoV-2 mutational effect prediction and antibody optimization underscore the potential of ProMIM as a powerful next-generation tool for developing novel therapeutic approaches and new drugs.

Authors (5)
  1. Yuanle Mo
  2. Xin Hong
  3. Bowen Gao
  4. Yinjun Jia
  5. Yanyan Lan