
Efficient interpolation of molecular properties across chemical compound space with low-dimensional descriptors (2311.15207v1)

Published 26 Nov 2023 in physics.chem-ph and cs.LG

Abstract: We demonstrate accurate data-starved models of molecular properties for interpolation in chemical compound spaces with low-dimensional descriptors. Our starting point is based on three-dimensional, universal, physical descriptors derived from the properties of the distributions of the eigenvalues of Coulomb matrices. To account for the shape and composition of molecules, we combine these descriptors with six-dimensional features informed by the Gershgorin circle theorem. We use the nine-dimensional descriptors thus obtained for Gaussian process regression based on kernels with variable functional form, leading to extremely efficient, low-dimensional interpolation models. The resulting models trained with 100 molecules are able to predict the product of entropy and temperature ($S \times T$) and zero point vibrational energy (ZPVE) with the absolute error under 1 kcal mol$^{-1}$ for $> 78\%$ and under 1.3 kcal mol$^{-1}$ for $> 92\%$ of molecules in the test data. The test data comprises 20,000 molecules with complexity varying from three atoms to 29 atoms and the ranges of $S \times T$ and ZPVE covering 36 kcal mol$^{-1}$ and 161 kcal mol$^{-1}$, respectively. We also illustrate that the descriptors based on the Gershgorin circle theorem yield more accurate models of molecular entropy than those based on graph neural networks that explicitly account for the atomic connectivity of molecules.
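
To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It builds the commonly used Coulomb matrix (diagonal $0.5 Z_i^{2.4}$, off-diagonal $Z_i Z_j / |R_i - R_j|$), computes its eigenvalues, and forms the Gershgorin discs (centre $M_{ii}$, radius $\sum_{j \neq i} |M_{ij}|$), which by the Gershgorin circle theorem contain every eigenvalue. The abstract does not specify the three eigenvalue-distribution descriptors or the six Gershgorin-informed features, so the `summary` vector (mean, standard deviation, maximum) and the water-like geometry are purely illustrative assumptions, as are the function names.

```python
import numpy as np

def coulomb_matrix(charges, coords):
    """Commonly used Coulomb matrix: 0.5*Z_i^2.4 on the diagonal,
    Z_i*Z_j/|R_i - R_j| off the diagonal."""
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder so the division below is safe
    m = np.outer(z, z) / dist
    np.fill_diagonal(m, 0.5 * z ** 2.4)  # overwrite the placeholder diagonal
    return m

def gershgorin_discs(m):
    """Disc centres (diagonal entries) and radii (off-diagonal absolute row sums)."""
    centers = np.diag(m).copy()
    radii = np.abs(m).sum(axis=1) - np.abs(centers)
    return centers, radii

# Illustrative, approximate water-like geometry (assumed atomic units).
charges = [8, 1, 1]
coords = [[0.0, 0.0, 0.0], [0.0, 1.43, 1.11], [0.0, -1.43, 1.11]]

m = coulomb_matrix(charges, coords)
eigvals = np.linalg.eigvalsh(m)  # m is real symmetric
centers, radii = gershgorin_discs(m)

# Gershgorin circle theorem: each eigenvalue lies in at least one disc.
inside = [bool(np.any(np.abs(lam - centers) <= radii + 1e-12)) for lam in eigvals]

# Hypothetical 3-component summary of the eigenvalue distribution
# (an assumption; the paper's actual descriptors are not given in the abstract).
summary = np.array([eigvals.mean(), eigvals.std(), eigvals.max()])
print(inside, summary)
```

A fixed-length vector of this kind, whatever its exact definition, is what allows molecules of different sizes to share one low-dimensional input space for Gaussian process regression.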
