
A Recipe for Charge Density Prediction (2405.19276v1)

Published 29 May 2024 in physics.comp-ph and cs.LG

Abstract: In density functional theory, charge density is the core attribute of atomic systems from which all chemical properties can be derived. Machine learning methods are promising in significantly accelerating charge density prediction, yet existing approaches either lack accuracy or scalability. We propose a recipe that can achieve both. In particular, we identify three key ingredients: (1) representing the charge density with atomic and virtual orbitals (spherical fields centered at atom/virtual coordinates); (2) using expressive and learnable orbital basis sets (basis function for the spherical fields); and (3) using high-capacity equivariant neural network architecture. Our method achieves state-of-the-art accuracy while being more than an order of magnitude faster than existing methods. Furthermore, our method enables flexible efficiency-accuracy trade-offs by adjusting the model/basis sizes.


Summary

  • The paper introduces a novel ML method that uses an equivariant spherical channel network to predict charge densities efficiently and accurately in DFT.
  • It leverages expressive, even-tempered Gaussian basis sets with learnable scaling factors to balance computational cost and precision.
  • The proposed approach outperforms previous methods with a 0.178% NMAE and a 31.7x speedup on the QM9 charge density dataset.

Machine Learning for Efficient and Accurate Charge Density Prediction in Density Functional Theory

The paper addresses the challenging problem of predicting charge densities in atomic systems with machine learning, aiming to improve both accuracy and efficiency significantly. Rooted in Density Functional Theory (DFT), the charge density is fundamental to determining all electronic properties of a system. However, traditional DFT methods, particularly the widely used Kohn-Sham formalism, are computationally intensive, scaling approximately as O(N_e^3), where N_e is the number of electrons. This scaling limits the practical application of DFT to large-scale systems and long-timescale molecular dynamics simulations.

Problem Statement and Proposed Solution

The primary challenge in predicting charge densities lies in balancing efficiency and accuracy. Traditional methods suffer from either inefficiency or limited accuracy. The paper introduces a novel ML approach that employs three key components to achieve both high accuracy and scalability:

  1. Representation of charge density using atomic and virtual orbitals, leveraging their spherical field properties.
  2. Implementation of expressive and learnable orbital basis sets, allowing for flexible accuracy-efficiency trade-offs.
  3. Utilization of a high-capacity equivariant neural network architecture, specifically an equivariant spherical channel network (eSCN).

Methodology

Charge Density Representation

The charge density is represented using Gaussian-type orbitals (GTOs), defined by spherical Gaussian functions centered at atomic or virtual coordinates. The approach adopts:

  • Virtual Orbitals: Spherical fields at positions other than atomic centers to capture non-local electronic structures. These virtual orbitals are placed at the midpoints of chemical bonds.
  • Even-Tempered Gaussian Basis Sets: A method to generate more expressive basis sets by defining the exponents α_k as a geometric sequence, allowing the model to handle a varied range of atomic environments.
  • Learnable Scaling Factors: Trainable parameters associated with orbital exponents to further enhance the basis set’s flexibility and expressivity.
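As a concrete illustration of these ingredients (a minimal sketch, not the paper's implementation; function names and the specific α, β values are illustrative), an even-tempered basis fixes a geometric progression of exponents α_k = α·β^k, each defining a Gaussian-type orbital, while virtual orbital centers are simply bond midpoints:

```python
import numpy as np

def even_tempered_exponents(alpha, beta, n):
    """Even-tempered exponents alpha_k = alpha * beta**k, k = 0..n-1."""
    return alpha * beta ** np.arange(n)

def gto_s(r, alpha):
    """Normalized s-type Gaussian-type orbital evaluated at radii r."""
    norm = (2.0 * alpha / np.pi) ** 0.75
    return norm * np.exp(-alpha * r ** 2)

# Exponents spanning tight to diffuse functions for one element
alphas = even_tempered_exponents(alpha=0.5, beta=2.0, n=4)

# Radial profile of the most diffuse s orbital
r = np.linspace(0.0, 3.0, 5)
phi = gto_s(r, alphas[0])

# Virtual orbital center at a bond midpoint (positions are illustrative)
atom_a = np.array([0.0, 0.0, 0.0])
atom_b = np.array([0.0, 0.0, 1.1])
virtual_center = 0.5 * (atom_a + atom_b)
```

The learnable scaling factors of the paper would multiply each α_k before the orbital is evaluated, letting the network adapt the fixed progression to each atom's environment.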

Prediction Model

The prediction model is built using an eSCN, which is designed to be SE(3)-equivariant, maintaining rotational and translational symmetries. The model predicts coefficients and scaling factors for the orbital basis functions, and the charge density is evaluated using the learned parameters. Training instability due to the learnable scaling factors is mitigated through a fine-tuning approach, stabilizing the training process.
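Once the network has produced per-orbital coefficients and scaling factors, the density is a linear combination of the basis functions. The sketch below (restricted to s-type orbitals for brevity; the paper's basis also carries higher angular momenta via spherical harmonics, and all names here are illustrative) shows that evaluation step:

```python
import numpy as np

def density_from_coefficients(coeffs, scales, alphas, centers, points):
    """rho(r) = sum_i c_i * phi_i(r), with per-orbital exponent scaling.

    coeffs:  (K,) predicted expansion coefficients
    scales:  (K,) predicted positive exponent scaling factors
    alphas:  (K,) base even-tempered exponents
    centers: (K, 3) atomic/virtual orbital centers
    points:  (M, 3) grid points where the density is evaluated
    """
    rho = np.zeros(len(points))
    for c, s, a, center in zip(coeffs, scales, alphas, centers):
        alpha = s * a                                  # learnable scaling
        r2 = np.sum((points - center) ** 2, axis=1)
        norm = (2.0 * alpha / np.pi) ** 0.75
        rho += c * norm * np.exp(-alpha * r2)
    return rho

# Tiny example: one s orbital at the origin, evaluated at two points
rho = density_from_coefficients(
    coeffs=np.array([1.0]), scales=np.array([1.2]),
    alphas=np.array([1.0]), centers=np.zeros((1, 3)),
    points=np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]),
)
```

Because the coefficients transform equivariantly with the molecule's orientation, the evaluated density rotates consistently with the input geometry.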

Experimental Validation

QM9 Charge Density Benchmark

The models were tested on the QM9 charge density dataset, containing charge density calculations for 133,845 organic molecules. Performance metrics used include the Normalized Mean Absolute Error (NMAE) and computational efficiency in terms of predictions per minute on a GPU.
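The NMAE figure quoted below is, as commonly defined for charge density benchmarks, the total absolute density error normalized by the total density on the grid. A minimal sketch (assuming that standard definition; the arrays here are toy values):

```python
import numpy as np

def nmae(rho_pred, rho_true):
    """Normalized mean absolute error over a real-space grid:
    NMAE = sum(|rho_pred - rho_true|) / sum(|rho_true|)."""
    return np.sum(np.abs(rho_pred - rho_true)) / np.sum(np.abs(rho_true))

# Toy example: one of two grid points is off by 1
rho_true = np.array([1.0, 2.0])
rho_pred = np.array([1.0, 1.0])
error = nmae(rho_pred, rho_true)  # 1/3
```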

Results

The proposed method achieved a substantial improvement in both accuracy and efficiency:

  • The best model achieved an NMAE of 0.178%, outperforming the previous state-of-the-art model ChargE3Net by a significant margin while being 31.7 times faster.
  • Various configurations of the model demonstrated flexible trade-offs between accuracy and efficiency, illustrating the robustness and adaptability of the proposed approach.

Implications and Future Directions

Practical Implications

The results demonstrate a significant advancement in the accuracy-efficiency trade-off for charge density prediction. This progress has practical implications for accelerating DFT calculations, enabling faster and more efficient simulations of large and complex molecular systems.

Theoretical Implications

From a theoretical perspective, the introduction of virtual orbitals, expressive basis sets, and high-capacity equivariant networks offers a new paradigm in representing and learning charge densities. This approach may influence subsequent developments in ML for quantum chemistry, particularly in creating models that maintain physical symmetries and leverage domain-specific knowledge.

Speculative Future Developments

Future research can explore several avenues:

  • Optimizing Virtual Orbital Placement: Employing generative models to learn optimal placements of virtual orbitals could further enhance accuracy.
  • Exploring Alternative Basis Functions: Testing other basis functions, such as Slater-type orbitals or non-decaying radial functions, may yield additional performance improvements.
  • Extending to Crystalline Materials: Validating and adapting the approach for use in crystalline materials could broaden its applicability and demonstrate its utility in a wider range of chemical and material science problems.

Conclusion

The proposed method sets a new benchmark in machine learning for charge density prediction, effectively balancing the trade-offs between accuracy and computational efficiency. The implications of this research are manifold, potentially transforming computational practices in quantum chemistry and materials science while providing a foundation for further innovations in the field.