
Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling and Zero-Shot Transfer (2410.21683v1)

Published 29 Oct 2024 in cs.LG and physics.chem-ph

Abstract: Constructing transferable descriptors for conformation representation of molecular and biological systems has numerous applications in drug discovery, learning-based molecular dynamics, and protein mechanism analysis. Geometric graph neural networks (Geom-GNNs) with all-atom information have transformed atomistic simulations by serving as general learnable geometric descriptors for downstream tasks, including prediction of interatomic potentials and molecular properties. However, common practice involves supervising Geom-GNNs on specific downstream tasks, which suffers from a lack of high-quality data and from inaccurate labels, leading to poor generalization and performance degradation in out-of-distribution (OOD) scenarios. In this work, we explore the possibility of using pre-trained Geom-GNNs as transferable and highly effective geometric descriptors for improved generalization. To explore their representation power, we study the scaling behaviors of Geom-GNNs under self-supervised pre-training, supervised, and unsupervised learning setups. We find that the expressive power of different architectures can differ on the pre-training task. Interestingly, Geom-GNNs do not follow power-law scaling on the pre-training task, and universally lack predictable scaling behavior on supervised tasks with quantum chemical labels, which are important for the screening and design of novel molecules. More importantly, we demonstrate how all-atom graph embeddings can be organically combined with other neural architectures to enhance expressive power. Meanwhile, the low-dimensional projection of the latent space shows excellent agreement with conventional geometrical descriptors.

