A Multi-Grained Symmetric Differential Equation Model for Learning Protein-Ligand Binding Dynamics (2401.15122v3)
Abstract: In drug discovery, molecular dynamics (MD) simulation of protein-ligand binding is a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites. There is a long history of improving the efficiency of MD simulations through better numerical methods and, more recently, through machine learning (ML) methods. Yet challenges remain, such as accurately modeling extended-timescale simulations. To address this issue, we propose NeuralMD, the first ML surrogate that can facilitate numerical MD and provide accurate simulations of protein-ligand binding dynamics. We propose a principled approach that incorporates a novel physics-informed multi-grained group symmetric framework. Specifically, we propose (1) the BindingNet model, which satisfies group symmetry using vector frames and captures the multi-level protein-ligand interactions, and (2) an augmented neural differential equation solver that learns the trajectory under Newtonian mechanics. For the experiments, we design ten single-trajectory and three multi-trajectory binding simulation tasks. We demonstrate the efficiency and effectiveness of NeuralMD, which achieves over 1K$\times$ speedup compared to standard numerical MD simulations. NeuralMD also outperforms all other ML approaches, achieving up to a 15$\times$ reduction in reconstruction error and a 70% increase in validity. Additionally, we qualitatively illustrate that the oscillations in the predicted trajectories align more closely with ground-truth dynamics than those of other ML methods. We believe NeuralMD lays the foundation for a new research paradigm in simulating protein-ligand dynamics.
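To make "learns the trajectory under Newtonian mechanics" concrete, the following is a minimal sketch of integrating the second-order system dx/dt = v, dv/dt = F(x)/m with a classical velocity Verlet step. It is not the NeuralMD solver: the abstract does not specify the integrator, and `harmonic_force` is a hypothetical stand-in for a learned force model such as BindingNet. All function names, the unit mass, and the step size are assumptions chosen for illustration.

```python
from typing import Callable, List, Tuple

Vec = List[float]


def harmonic_force(x: Vec, k: float = 1.0) -> Vec:
    """Hypothetical placeholder for a learned force model: F(x) = -k * x."""
    return [-k * xi for xi in x]


def velocity_verlet(
    x0: Vec,
    v0: Vec,
    force: Callable[[Vec], Vec],
    mass: float = 1.0,
    dt: float = 0.01,
    n_steps: int = 100,
) -> Tuple[List[Vec], Vec]:
    """Integrate dx/dt = v, dv/dt = F(x)/m; return (positions per step, final velocity)."""
    x, v = list(x0), list(v0)
    f = force(x)
    trajectory = [list(x)]
    for _ in range(n_steps):
        # Half-step velocity update using the force at the current position.
        v_half = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]
        # Full-step position update.
        x = [xi + dt * vh for xi, vh in zip(x, v_half)]
        # Re-evaluate the force at the new position, then finish the velocity update.
        f = force(x)
        v = [vh + 0.5 * dt * fi / mass for vh, fi in zip(v_half, f)]
        trajectory.append(list(x))
    return trajectory, v


def total_energy(x: Vec, v: Vec, mass: float = 1.0, k: float = 1.0) -> float:
    """Kinetic plus harmonic potential energy, used only as a sanity check."""
    kinetic = 0.5 * mass * sum(vi * vi for vi in v)
    potential = 0.5 * k * sum(xi * xi for xi in x)
    return kinetic + potential


if __name__ == "__main__":
    traj, v_final = velocity_verlet([1.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                                    harmonic_force, dt=0.01, n_steps=628)
    print("final position:", traj[-1])
    print("final energy:", total_energy(traj[-1], v_final))
```

In a learned surrogate, the force callable would be replaced by a neural network evaluated on the protein-ligand geometry. The energy check only makes sense for the toy harmonic force used here.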
- Shengchao Liu
- Weitao Du
- Yanjing Li
- Zhuoxinran Li
- Vignesh Bhethanabotla
- Christian Borgs
- Anima Anandkumar
- Hongyu Guo
- Jennifer Chayes
- Hannan Xu
- Divin Yan