
Fusing Neural and Physical: Augment Protein Conformation Sampling with Tractable Simulations (2402.10433v2)

Published 16 Feb 2024 in q-bio.BM, cs.LG, and q-bio.QM

Abstract: Protein dynamics are common and important for biological function and properties, and studying them usually involves time-consuming molecular dynamics (MD) simulations in silico. Recently, generative models have been leveraged as surrogate samplers to obtain conformation ensembles orders of magnitude faster and without requiring any simulation data ("zero-shot" inference). However, being agnostic of the underlying energy landscape, such generative models may still be limited in accuracy. In this work, we explore the few-shot setting of such a pre-trained generative sampler, which incorporates MD simulations in a tractable manner. Specifically, given a target protein of interest, we first acquire seeding conformations from the pre-trained sampler and then run a number of physical simulations in parallel starting from these seeds. We then fine-tune the generative model on the resulting simulation trajectories to obtain a target-specific sampler. Experimental results demonstrate the superior performance of this few-shot conformation sampler at a tractable computational cost.
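
The seed, simulate, fine-tune loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a one-dimensional double-well energy in place of a protein force field, a broad Gaussian in place of the pre-trained sampler, overdamped Langevin dynamics in place of MD, and a kernel density fit to the trajectory frames in place of fine-tuning a neural model. All function names and parameter values are hypothetical.

```python
import math
import random


def energy(x):
    # Toy double-well energy U(x) = (x^2 - 1)^2 with minima at x = -1 and x = +1.
    return (x * x - 1.0) ** 2


def energy_grad(x):
    # dU/dx for the toy energy above.
    return 4.0 * x * (x * x - 1.0)


def pretrained_sample(rng, n):
    # Stand-in for the zero-shot sampler: broad and unaware of the energy landscape.
    return [rng.gauss(0.0, 1.5) for _ in range(n)]


def short_simulation(x0, rng, steps=400, dt=0.005, kT=0.2, stride=10):
    # Overdamped Langevin dynamics (Euler-Maruyama), a stand-in for a short MD run.
    x, frames = x0, []
    noise = math.sqrt(2.0 * kT * dt)
    for step in range(steps):
        x = x - dt * energy_grad(x) + noise * rng.gauss(0.0, 1.0)
        if step % stride == 0:
            frames.append(x)  # keep every `stride`-th frame of the trajectory
    return frames


def fine_tune(frames, bandwidth=0.05):
    # Stand-in for fine-tuning: a Gaussian kernel density fit to the frames.
    def sampler(rng, n):
        return [rng.choice(frames) + rng.gauss(0.0, bandwidth) for _ in range(n)]
    return sampler


def few_shot_sampler(seed=0, n_seeds=8):
    rng = random.Random(seed)
    seeds = pretrained_sample(rng, n_seeds)       # 1. seeding conformations
    frames = []
    for x0 in seeds:                              # 2. independent runs (parallelizable)
        frames.extend(short_simulation(x0, rng))
    return fine_tune(frames)                      # 3. target-specific sampler


def mean_energy(xs):
    return sum(energy(x) for x in xs) / len(xs)


if __name__ == "__main__":
    rng = random.Random(1)
    zero_shot = pretrained_sample(rng, 2000)
    few_shot = few_shot_sampler(seed=0)(rng, 2000)
    print("zero-shot mean energy:", mean_energy(zero_shot))
    print("few-shot mean energy: ", mean_energy(few_shot))
```

In this toy setting the few-shot samples concentrate near the two energy minima, while the zero-shot samples remain spread over high-energy regions. The short simulations are independent of one another, which is what makes the simulation step parallelizable.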

