Crystal Structure Prediction by Joint Equivariant Diffusion (2309.04475v2)
Abstract: Crystal Structure Prediction (CSP) is crucial in various scientific disciplines. While CSP can be addressed with currently prevailing generative models (e.g., diffusion models), the task poses unique challenges owing to the symmetric geometry of crystal structures: invariance to translation, rotation, and periodicity. To incorporate these symmetries, this paper proposes DiffCSP, a novel diffusion model that learns the structure distribution from stable crystals. Specifically, DiffCSP jointly generates the lattice and atom coordinates of each crystal using a periodic-E(3)-equivariant denoising model, to better capture the crystal geometry. Notably, unlike related equivariant generative approaches, DiffCSP represents crystals with fractional coordinates rather than Cartesian coordinates, which markedly benefits the diffusion and generation of atom positions. Extensive experiments verify that DiffCSP significantly outperforms existing CSP methods, at a much lower computational cost than DFT-based methods. Moreover, the superiority of DiffCSP is also observed when it is extended to ab initio crystal generation.
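To make the fractional-coordinate representation mentioned in the abstract concrete, the following is a minimal sketch and not the DiffCSP implementation. It assumes a lattice given as three row vectors, and the helper names (`frac_to_cart`, `wrap`, `periodic_delta`) as well as the example cell and atoms are hypothetical, chosen only for illustration. It shows that fractional coordinates live in [0, 1), and that the wrapped displacement between two atoms is unchanged when the whole crystal is translated periodically.

```python
import math

def frac_to_cart(lattice, frac):
    """Cartesian position x = sum_i f_i * l_i, where row i of `lattice` is lattice vector l_i."""
    return [sum(frac[i] * lattice[i][k] for i in range(3)) for k in range(3)]

def wrap(frac):
    """Wrap fractional coordinates back into the unit cell [0, 1)."""
    return [f - math.floor(f) for f in frac]

def periodic_delta(f_a, f_b):
    """Shortest fractional displacement from a to b, each component in [-0.5, 0.5)."""
    return [((b - a + 0.5) % 1.0) - 0.5 for a, b in zip(f_a, f_b)]

# Hypothetical orthorhombic cell (lengths in angstroms) and two atoms.
lattice = [[4.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 6.0]]
atom_a = [0.10, 0.20, 0.30]
atom_b = [0.90, 0.20, 0.30]

# Translate every atom by the same fractional shift, then wrap into the cell.
shift = [0.25, 0.50, 0.75]
shifted_a = wrap([f + s for f, s in zip(atom_a, shift)])
shifted_b = wrap([f + s for f, s in zip(atom_b, shift)])

delta_before = periodic_delta(atom_a, atom_b)
delta_after = periodic_delta(shifted_a, shifted_b)
center_cart = frac_to_cart(lattice, [0.5, 0.5, 0.5])
```

Here `delta_before` and `delta_after` agree up to floating-point error, which is the periodic translation invariance the abstract refers to; the lattice only enters when converting to Cartesian positions.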