Full-Atom Peptide Design with Geometric Latent Diffusion (2402.13555v4)

Published 21 Feb 2024 in q-bio.BM

Abstract: Peptide design plays a pivotal role in therapeutics, opening up target binding sites that were previously undruggable. Most existing methods are either inefficient or only concerned with the target-agnostic design of 1D sequences. In this paper, we propose a generative model for full-atom Peptide design with Geometric LAtent Diffusion (PepGLAD) given the binding site. We first establish a benchmark consisting of both 1D sequences and 3D structures from the Protein Data Bank (PDB) and the literature for systematic evaluation. We then identify two major challenges in applying current diffusion-based models to peptide design: the full-atom geometry and the variable binding geometry. To tackle the first challenge, PepGLAD derives a variational autoencoder that first encodes full-atom residues of variable size into fixed-dimensional latent representations, and then decodes back to the residue space after conducting the diffusion process in the latent space. For the second challenge, PepGLAD explores a receptor-specific affine transformation that converts the 3D coordinates into a shared standard space, enabling better generalization across different binding shapes. Experimental results show that our method not only significantly improves diversity and binding affinity in the task of sequence-structure co-design, but also excels at recovering reference structures for binding conformation generation.
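
The abstract does not say how the receptor-specific affine transformation is built. As a rough illustration only, the sketch below assumes one common construction: a whitening map fitted to the binding-site coordinates, using their mean and the Cholesky factor of their covariance. The function names (`fit_affine`, `to_standard`, `from_standard`) and the synthetic coordinates are hypothetical and are not taken from the paper or its code.

```python
import numpy as np

def fit_affine(site_xyz):
    """Fit a per-receptor affine map from an (N, 3) array of binding-site coordinates.

    Assumption: the map is a whitening transform, so we return the mean mu and
    the lower-triangular Cholesky factor L with covariance = L @ L.T.
    """
    mu = site_xyz.mean(axis=0)
    cov = np.cov(site_xyz, rowvar=False)  # 3x3 sample covariance
    L = np.linalg.cholesky(cov)           # requires a positive-definite covariance
    return mu, L

def to_standard(xyz, mu, L):
    """Map coordinates into the shared standard space: z = L^{-1} (x - mu)."""
    return np.linalg.solve(L, (xyz - mu).T).T

def from_standard(z, mu, L):
    """Invert the map: x = L z + mu."""
    return z @ L.T + mu

# Synthetic, elongated "binding site" (hypothetical data, not from the benchmark).
rng = np.random.default_rng(0)
site = rng.normal(size=(50, 3)) * np.array([8.0, 3.0, 1.0]) + np.array([10.0, -5.0, 2.0])

mu, L = fit_affine(site)
z = to_standard(site, mu, L)
```

Under this assumption, every binding site has zero mean and identity covariance after the transform, whatever its original shape or scale, so a diffusion model with a standard Gaussian prior would see comparably scaled inputs across receptors. Generated points would be mapped back to the receptor frame with `from_standard`.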
