Symphony: Symmetry-Equivariant Point-Centered Spherical Harmonics for 3D Molecule Generation

Published 27 Nov 2023 in cs.LG and q-bio.BM | (2311.16199v3)

Abstract: We present Symphony, an $E(3)$-equivariant autoregressive generative model for 3D molecular geometries that iteratively builds a molecule from molecular fragments. Existing autoregressive models such as G-SchNet and G-SphereNet for molecules utilize rotationally invariant features to respect the 3D symmetries of molecules. In contrast, Symphony uses message-passing with higher-degree $E(3)$-equivariant features. This allows a novel representation of probability distributions via spherical harmonic signals to efficiently model the 3D geometry of molecules. We show that Symphony is able to accurately generate small molecules from the QM9 dataset, outperforming existing autoregressive models and approaching the performance of diffusion models.

Summary

  • The paper introduces Symphony, a novel E(3)-equivariant model that builds 3D molecular geometries using spherical harmonic projections.
  • It employs a message-passing approach to predict atomic positions around a focus atom, achieving high chemical validity on the QM9 dataset.
  • The model advances autoregressive frameworks by efficiently capturing molecular symmetries, with significant implications for materials design and drug discovery.

Overview of "Symphony: Symmetry-Equivariant Point-Centered Spherical Harmonics for 3D Molecule Generation"

The paper presents Symphony, an E(3)-equivariant autoregressive generative model for three-dimensional (3D) molecular geometries. The model builds a molecule iteratively from molecular fragments, using message passing with higher-degree E(3)-equivariant features and representing probability distributions as spherical harmonic signals. The authors position Symphony as an advance over existing autoregressive models such as G-SchNet and G-SphereNet, which respect the 3D symmetries of molecules through rotationally invariant features.

Method and Innovation

Symphony uses message passing over higher-degree E(3)-equivariant features, and represents the probability distributions it predicts as spherical harmonic signals. At each step the model selects a single 'focus' atom, then predicts the type and position of the next atom around it from the learned distributions, preserving E(3) symmetry throughout. This departs from earlier autoregressive models, which rely on rotationally invariant features, and the authors argue that it allows more detailed and accurate predictions.
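The difference between invariant and equivariant features can be made concrete with a toy example. The sketch below is not Symphony's network: the helper names (`rotation_matrix`, `invariant_feature`, `equivariant_feature`) and the four-atom geometry are hypothetical, chosen only to show that an invariant feature is unchanged when the molecule is rotated, while a degree-1 equivariant feature rotates with it and so retains directional information.

```python
import numpy as np

def rotation_matrix(angle_z, angle_x):
    """Compose a rotation about z with a rotation about x (angles in radians)."""
    cz, sz = np.cos(angle_z), np.sin(angle_z)
    cx, sx = np.cos(angle_x), np.sin(angle_x)
    rot_z = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return rot_x @ rot_z

def invariant_feature(positions, focus):
    """Sorted distances from the focus atom to every other atom."""
    rel = np.delete(positions, focus, axis=0) - positions[focus]
    return np.sort(np.linalg.norm(rel, axis=1))

def equivariant_feature(positions, focus):
    """Sum of unit vectors from the focus atom to the other atoms (degree 1)."""
    rel = np.delete(positions, focus, axis=0) - positions[focus]
    return (rel / np.linalg.norm(rel, axis=1, keepdims=True)).sum(axis=0)

# Hypothetical four-atom fragment; atom 0 plays the role of the focus atom.
positions = np.array(
    [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [-0.4, 1.0, 0.2], [-0.3, -0.6, 0.9]]
)
R = rotation_matrix(0.7, -1.2)
rotated = positions @ R.T  # rotate every atom by R

inv_before = invariant_feature(positions, 0)
inv_after = invariant_feature(rotated, 0)
eq_before = equivariant_feature(positions, 0)
eq_after = equivariant_feature(rotated, 0)

# Invariant: identical before and after the rotation.
print(np.allclose(inv_before, inv_after))
# Equivariant: the feature of the rotated input is the rotated feature.
print(np.allclose(R @ eq_before, eq_after))
```

Both checks should print `True`. The invariant feature cannot tell the two orientations apart, whereas the equivariant one carries the orientation along with it, which is the property a model needs in order to predict where the next atom goes relative to the focus atom.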

A central methodological step is the projection of probability distributions onto spherical harmonics. The resulting signal describes how likely the next atom is to sit at a given distance from the focus atom and in a given direction around it. The model predicts the spherical harmonic coefficients of this signal, which gives a compact and expressive description of probable atomic positions.
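As a minimal sketch of this idea, the code below turns a vector of spherical harmonic coefficients into a probability distribution over directions at a single radius. Several pieces are assumptions made for illustration rather than details taken from the paper: the truncation at degree 2, treating the signal as a log-density and normalizing it with a softmax over a Fibonacci grid, the sharpness `beta`, and building the coefficients by projecting a single target direction. In Symphony the coefficients would come from the network.

```python
import numpy as np

def real_sph_harm(v):
    """Real spherical harmonics up to degree 2 for unit vectors v of shape (..., 3)."""
    x, y, z = v[..., 0], v[..., 1], v[..., 2]
    c0 = 0.5 * np.sqrt(1.0 / np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    c2 = 0.5 * np.sqrt(15.0 / np.pi)
    return np.stack(
        [
            c0 * np.ones_like(x),                                # l = 0
            c1 * y, c1 * z, c1 * x,                              # l = 1
            c2 * x * y, c2 * y * z,                              # l = 2, m = -2, -1
            0.25 * np.sqrt(5.0 / np.pi) * (3.0 * z**2 - 1.0),    # l = 2, m = 0
            c2 * x * z,                                          # l = 2, m = 1
            0.5 * c2 * (x**2 - y**2),                            # l = 2, m = 2
        ],
        axis=-1,
    )

def fibonacci_sphere(n):
    """Roughly uniform grid of n unit vectors on the sphere."""
    i = np.arange(n) + 0.5
    z = 1.0 - 2.0 * i / n
    phi = np.pi * (1.0 + np.sqrt(5.0)) * i
    r = np.sqrt(1.0 - z**2)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1)

def direction_distribution(coeffs, grid):
    """Treat the spherical harmonic signal as a log-density and normalize it on the grid."""
    logits = real_sph_harm(grid) @ coeffs
    logits = logits - logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Hypothetical target direction for the next atom, as a unit vector.
target = np.array([1.0, 2.0, 2.0]) / 3.0
beta = 10.0  # hypothetical sharpness of the distribution

# Nine coefficients stand in for what a network would predict.
coeffs = beta * real_sph_harm(target)

grid = fibonacci_sphere(2000)
probs = direction_distribution(coeffs, grid)

rng = np.random.default_rng(0)
sample = grid[rng.choice(len(grid), p=probs)]  # one sampled direction
mode = grid[np.argmax(probs)]                  # most likely grid direction
print(mode @ target)  # cosine between the mode and the target, close to 1
```

Nine numbers are enough to place a smooth peak anywhere on the sphere, which illustrates why the representation is compact. Higher degrees would allow sharper or multi-peaked distributions, at the cost of more coefficients.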

Numerical Results and Claims

The paper reports that Symphony outperforms other autoregressive models at generating small molecules from the QM9 dataset and approaches the performance of diffusion models. In particular, Symphony achieves a high validity rate, producing chemically plausible molecules more often than previous autoregressive frameworks. The authors also introduce a bispectrum-based metric for angular accuracy, which they use to show that generated structures match the statistics of the training set.

Practical and Theoretical Implications

Practically, Symphony's contributions could be pivotal in domains like materials design and drug discovery, where efficient and accurate prediction of complex molecular geometries is essential. By adequately capturing fragment symmetries and generating stable molecular environments, Symphony could alleviate computational bottlenecks associated with quantum mechanical simulations.

Theoretically, this paper extends the utility of spherical harmonics and higher-order equivariant features in autoregressive modeling. It challenges the traditional reliance on rotational invariants and encourages further exploration into models that capture a wider breadth of symmetrical properties inherent in scientific phenomena.

Speculation on Future Developments

Symphony's framework could inspire future autoregressive models to incorporate multiscale representations, or to improve radial resolution with normalizing flows. Future work might also target computational efficiency, since the tensor products required for higher-degree features slow down inference. Integration with reinforcement learning could further refine the generative process, allowing models like Symphony to learn better generation orders for complex systems.

In summary, the introduction of Symphony illustrates the potential of utilizing symmetry-equivariant neural networks for sophisticated molecular generation tasks, opening new avenues for research in both machine learning and chemical sciences.
