MUDiff: Unified Diffusion for Complete Molecule Generation (2304.14621v3)

Published 28 Apr 2023 in cs.LG and q-bio.BM

Abstract: Molecule generation is an important practical problem, with applications in drug discovery and material design, and AI methods promise useful solutions. However, existing methods focus either on the 2D graph structure or on the 3D geometric structure of a molecule, neither of which alone suffices: the 2D graph captures mainly topology, while the 3D geometry captures mainly the spatial arrangement of atoms. Combining these representations is essential to represent a molecule fully. In this paper, we present a new model that generates a comprehensive representation of molecules, including atom features, discrete 2D molecular structure, and continuous 3D molecular coordinates, by combining discrete and continuous diffusion processes. The use of diffusion processes allows capturing the probabilistic nature of molecular processes and exploring the effect of different factors on molecular structures. Additionally, we propose a novel graph transformer architecture to denoise the diffusion process. The transformer adheres to 3D roto-translation equivariance constraints, learning invariant atom and edge representations while preserving the equivariance of atom coordinates; it can therefore learn molecular representations that are robust to geometric transformations. We evaluate the model through experiments and comparisons with existing methods, showing that it generates more stable and valid molecules. Our model is a promising approach for designing stable and diverse molecules and can be applied to a wide range of tasks in molecular modeling.
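The core idea of combining a continuous diffusion process on atom coordinates with a discrete one on bond types can be illustrated with a minimal forward-noising step. The sketch below is a hypothetical simplification in the spirit of DDPM (Gaussian) and D3PM (uniform categorical) transition kernels; the function name, the uniform resampling kernel, and the single scalar `alpha` schedule are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def noise_step(x, e, alpha, n_bond_types=5, rng=None):
    """One joint forward diffusion step on a molecule (illustrative sketch).

    Continuous part: atom coordinates x (n, 3) receive Gaussian noise,
    interpolated by alpha in (0, 1], as in DDPM.
    Discrete part: each bond-type label in e (n, n) is resampled
    uniformly with probability 1 - alpha, as in a D3PM uniform kernel.
    """
    rng = rng or np.random.default_rng()
    # Gaussian corruption of the continuous 3D coordinates.
    x_noisy = np.sqrt(alpha) * x + np.sqrt(1 - alpha) * rng.standard_normal(x.shape)
    # Uniform categorical corruption of the discrete bond-type matrix.
    resample = rng.random(e.shape) > alpha
    e_noisy = np.where(resample, rng.integers(0, n_bond_types, e.shape), e)
    return x_noisy, e_noisy
```

At `alpha = 1` both structures pass through unchanged; as `alpha` decreases, coordinates approach pure Gaussian noise while bond labels approach the uniform distribution, which is the limiting behavior a joint denoiser would be trained to invert.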

Authors (7)
  1. Chenqing Hua
  2. Sitao Luan
  3. Minkai Xu
  4. Rex Ying
  5. Jie Fu
  6. Stefano Ermon
  7. Doina Precup
Citations (20)