Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations (2205.08306v1)

Published 17 May 2022 in physics.chem-ph, cs.LG, and q-bio.BM

Abstract: Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes. Accurate MD simulations require computationally demanding quantum-mechanical calculations, being practically limited to short timescales and few atoms. For larger systems, efficient, but much less reliable empirical force fields are used. Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations, offering similar accuracy as ab initio methods at orders-of-magnitude speedup. Until now, MLFFs mainly capture short-range interactions in small molecules or periodic materials, due to the increased complexity of constructing models and obtaining reliable reference data for large molecules, where long-ranged many-body effects become important. This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations (GEMS) by training on "bottom-up" and "top-down" molecular fragments of varying size, from which the relevant physicochemical interactions can be learned. GEMS is applied to study the dynamics of alanine-based peptides and the 46-residue protein crambin in aqueous solution, allowing nanosecond-scale MD simulations of >25k atoms at essentially ab initio quality. Our findings suggest that structural motifs in peptides and proteins are more flexible than previously thought, indicating that simulations at ab initio accuracy might be necessary to understand dynamic biomolecular processes such as protein (mis)folding, drug-protein binding, or allosteric regulation.

Authors (11)
  1. Oliver T. Unke (24 papers)
  2. Martin Stöhr (8 papers)
  3. Stefan Ganscha (3 papers)
  4. Thomas Unterthiner (24 papers)
  5. Hartmut Maennel (10 papers)
  6. Sergii Kashubin (4 papers)
  7. Daniel Ahlin (2 papers)
  8. Michael Gastegger (27 papers)
  9. Leonardo Medrano Sandonas (8 papers)
  10. Alexandre Tkatchenko (94 papers)
  11. Klaus-Robert Müller (167 papers)
Citations (19)

Summary

Machine learned force fields (MLFFs) are a significant advance for molecular dynamics (MD) simulations: they offer accuracy close to that of quantum-mechanical calculations at a drastically reduced computational cost. This paper introduces GEMS, a general approach to constructing accurate MLFFs for large-scale molecular simulations, and applies it to peptides and a protein in aqueous solution.

Overview of Methodology

The pivotal innovation in this work is the efficient generation of reference data from "bottom-up" and "top-down" molecular fragments. The bottom-up approach generates small molecular structures that represent the typical bonding environments found in the molecule of interest, so that the potential energy surface (PES) can be explored thoroughly through many conformations. The top-down strategy instead cuts the molecule of interest into larger fragments, which capture the long-range interactions needed to reproduce structural properties accurately. Combined, the two strategies let the model learn both the local and the long-range interactions present in large systems such as proteins, extending MLFFs to systems that are too large for direct quantum-mechanical calculations.
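The top-down idea can be illustrated with a minimal sketch. It assumes a fragment is simply every atom within a fixed distance of a chosen center atom; the spherical cutoff, the function name `top_down_fragment`, and the toy coordinates are all assumptions made for illustration and are not taken from the paper. A real workflow would also have to deal with bonds cut at the fragment boundary, which this sketch ignores.

```python
import math

def top_down_fragment(coords, center_index, cutoff):
    """Return indices of all atoms within `cutoff` of the chosen center atom.

    Hypothetical helper: a plain spherical cut, used here only to illustrate
    carving a larger fragment out of a big system.
    """
    center = coords[center_index]
    return [i for i, pos in enumerate(coords) if math.dist(center, pos) <= cutoff]

# Toy system (made-up coordinates): five atoms on the x-axis, 1.5 units apart.
coords = [(1.5 * i, 0.0, 0.0) for i in range(5)]

# Cut a fragment of radius 2.0 around the middle atom.
fragment = top_down_fragment(coords, center_index=2, cutoff=2.0)
print(fragment)
```

Increasing `cutoff` produces larger fragments that include more distant atoms, which is the sense in which top-down fragments can carry information about longer-range interactions.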

Key Findings

Applying MLFFs through GEMS yields notable advances in the study of biomolecular dynamics. For poly-alanine peptides, GEMS accurately reproduced the shifts in stabilization energy of alpha-helices that arise from cooperative hydrogen bonding. GEMS simulations also showed greater conformational flexibility of peptides and proteins in solution than conventional force fields such as AMBER99SB-ILDN predict.

In simulations of the protein crambin, traditional FFs predict a relatively rigid structure, whereas GEMS suggests significantly greater flexibility. Notably, the distribution of backbone dihedral angles was broader, which agrees better with experimental results. The stark differences in short- and long-timescale fluctuations between the GEMS and conventional simulations underscore the advantage of MLFFs for capturing more nuanced molecular dynamics, which could be pivotal for analyzing protein folding and enzyme activity.
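To make "a broader distribution of backbone dihedral angles" concrete, the sketch below computes a signed dihedral angle from four atomic positions and compares the spread of two sets of angles using the circular standard deviation. The helper names and both angle samples are invented for illustration; none of the numbers come from the paper's simulations.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def circular_std_deg(angles_deg):
    """Circular standard deviation (degrees) of a list of angles."""
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / len(angles_deg)
    r = min(math.hypot(c, s), 1.0)  # guard against rounding slightly above 1
    if r == 0.0:
        return math.inf  # angles spread uniformly: no preferred direction
    return math.degrees(math.sqrt(-2.0 * math.log(r)))

# Made-up dihedral samples (degrees): a "rigid" and a "flexible" trajectory.
narrow = [-62.0, -60.0, -58.0, -61.0, -59.0]
broad = [-90.0, -60.0, -30.0, -75.0, -45.0]

print(circular_std_deg(narrow), circular_std_deg(broad))
```

A circular statistic is used rather than an ordinary standard deviation because dihedral angles wrap around at ±180°, so values such as 179° and -179° should count as close together.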

Implications and Future Directions

The findings suggest that simulations at ab initio accuracy may be necessary to fully understand dynamic biomolecular processes such as protein (mis)folding, drug-protein binding, and allosteric regulation. This could significantly affect fields that rely on protein dynamics, such as drug discovery, where accurate insight into protein flexibility and interaction sites is essential. Extending GEMS to larger systems or longer timescales, possibly aided by advances in computational resources, could enable modeling of even more complex biomolecular mechanisms.

Furthermore, GEMS is a significant step toward making quantum-level accuracy more widely accessible in simulation studies across biochemistry and related fields. Predicting the dynamical properties of a system with machine learning, without the prohibitive cost of traditional quantum-mechanical methods, allows for broader research scope and potentially faster translation to experimental and clinical settings.

As the methodology matures, it could be used to study novel biomolecular systems and to conduct exploratory research that was not previously feasible. The generality of the approach also implies that it could aid in the discovery and optimization of biochemical systems beyond those studied here, making this work a significant contribution to computational chemistry and molecular biology.
