Machine-learned molecular mechanics force field for the simulation of protein-ligand systems and beyond (2307.07085v4)

Published 13 Jul 2023 in physics.chem-ph and cs.AI

Abstract: The development of reliable and extensible molecular mechanics (MM) force fields (fast, empirical models characterizing the potential energy surface of molecular systems) is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, espaloma-0.3, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1M energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
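
To make the idea of fitting an MM force field to energies and forces concrete, the following is a minimal sketch and not the espaloma-0.3 implementation. It assumes only a harmonic bond term, hand-set parameters `k` and `r0` (in espaloma these would be predicted by a graph neural network), arbitrary illustrative units, and a hypothetical loss weight `w_force`; every function and variable name here is invented for illustration.

```python
import numpy as np

def bond_energy(coords, bonds, k, r0):
    """Harmonic bond energy: sum over bonds of 0.5 * k * (r - r0)^2."""
    i, j = bonds[:, 0], bonds[:, 1]
    r = np.linalg.norm(coords[i] - coords[j], axis=1)
    return float(np.sum(0.5 * k * (r - r0) ** 2))

def bond_forces(coords, bonds, k, r0):
    """Analytic forces F = -dE/dx, accumulated per atom."""
    i, j = bonds[:, 0], bonds[:, 1]
    d = coords[i] - coords[j]
    r = np.linalg.norm(d, axis=1)
    g = (k * (r - r0) / r)[:, None] * d  # dE/dx_i for each bond
    forces = np.zeros_like(coords)
    np.add.at(forces, i, -g)  # force on atom i is minus the gradient
    np.add.at(forces, j, g)   # equal and opposite force on atom j
    return forces

def energy_force_loss(coords, bonds, k, r0, e_ref, f_ref, w_force=1.0):
    """Squared error on energy plus weighted mean squared error on forces."""
    e = bond_energy(coords, bonds, k, r0)
    f = bond_forces(coords, bonds, k, r0)
    return (e - e_ref) ** 2 + w_force * float(np.mean((f - f_ref) ** 2))

# Toy three-atom system with two bonds sharing atom 0 (illustrative values only).
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.1, 0.0]])
bonds = np.array([[0, 1], [0, 2]])
k = np.array([500.0, 500.0])   # assumed force constants
r0 = np.array([0.96, 0.96])    # assumed equilibrium bond lengths

energy = bond_energy(coords, bonds, k, r0)
forces = bond_forces(coords, bonds, k, r0)
# With the model's own output as the reference, the loss vanishes.
loss = energy_force_loss(coords, bonds, k, r0, energy, forces)
```

In a differentiable framework of the kind the abstract describes, a loss like this would be minimized with respect to the network weights that produce `k` and `r0`, rather than with respect to the parameters directly.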
