
Gradual Optimization Learning for Conformational Energy Minimization (2311.06295v2)

Published 5 Nov 2023 in physics.chem-ph and cs.LG

Abstract: Molecular conformation optimization is crucial to computer-aided drug discovery and materials design. Traditional energy minimization techniques rely on iterative optimization methods that use molecular forces calculated by a physical simulator (oracle) as anti-gradients. However, this is a computationally expensive approach that requires many interactions with a physical simulator. One way to accelerate this procedure is to replace the physical simulator with a neural network. Despite recent progress in neural networks for molecular conformation energy prediction, such models are prone to distribution shift, leading to inaccurate energy minimization. We find that the quality of energy minimization with neural networks can be improved by providing optimization trajectories as additional training data. Still, it takes around $5 \times 10^5$ additional conformations to match the physical simulator's optimization quality. In this work, we present the Gradual Optimization Learning Framework (GOLF) for energy minimization with neural networks that significantly reduces the required additional data. The framework consists of an efficient data-collecting scheme and an external optimizer. The external optimizer utilizes gradients from the energy prediction model to generate optimization trajectories, and the data-collecting scheme selects additional training data to be processed by the physical simulator. Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules using $50\times$ less additional data.
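To make the idea of using forces as anti-gradients concrete, here is a minimal sketch of iterative energy minimization. It is not the GOLF implementation: the "oracle" is a hypothetical toy harmonic chain of atoms on a line, and the names `toy_energy`, `toy_forces`, and `minimize`, as well as the step size and step count, are assumptions chosen for illustration. The recorded trajectory stands in for the kind of optimization trajectory the abstract describes as additional training data.

```python
def toy_energy(positions, r0=1.0, k=1.0):
    """Toy stand-in for an oracle: harmonic bonds between consecutive 1D atoms."""
    return sum(
        0.5 * k * (positions[i + 1] - positions[i] - r0) ** 2
        for i in range(len(positions) - 1)
    )


def toy_forces(positions, r0=1.0, k=1.0):
    """Forces are the negative gradient of the toy energy w.r.t. positions."""
    forces = [0.0] * len(positions)
    for i in range(len(positions) - 1):
        stretch = positions[i + 1] - positions[i] - r0
        forces[i] += k * stretch       # -dE/dx_i
        forces[i + 1] -= k * stretch   # -dE/dx_{i+1}
    return forces


def minimize(positions, step=0.1, n_steps=500):
    """Plain gradient descent: move each atom a small step along its force."""
    positions = list(positions)
    trajectory = [list(positions)]
    for _ in range(n_steps):
        forces = toy_forces(positions)
        positions = [x + step * f for x, f in zip(positions, forces)]
        trajectory.append(list(positions))
    return positions, trajectory


if __name__ == "__main__":
    final, trajectory = minimize([0.0, 0.5, 3.0])
    print("initial energy:", toy_energy(trajectory[0]))
    print("final energy:", toy_energy(final))
```

In the setting the abstract describes, the force call would come from a neural energy model rather than a closed-form function, and selected conformations from the trajectory would be passed to the physical simulator for labeling. That selection step is not shown here.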
