
A critical examination of compound stability predictions from machine-learned formation energies (2001.10591v2)

Published 28 Jan 2020 in cond-mat.mtrl-sci and physics.comp-ph

Abstract: Machine learning has emerged as a novel tool for the efficient prediction of materials properties, and claims have been made that machine-learned models for the formation energy of compounds can approach the accuracy of Density Functional Theory (DFT). The models tested in this work include five recently published compositional models, a baseline model using stoichiometry alone, and a structural model. By testing seven machine learning models for formation energy on stability predictions using the Materials Project database of DFT calculations for 85,014 unique chemical compositions, we show that while formation energies can indeed be predicted well, all compositional models perform poorly on predicting the stability of compounds, making them considerably less useful than DFT for the discovery and design of new solids. Most critically, in sparse chemical spaces where few stoichiometries have stable compounds, only the structural model is capable of efficiently detecting which materials are stable. The non-incremental improvement of structural models compared with compositional models is noteworthy and encourages the use of structural models for materials discovery, with the constraint that for any new composition, the ground-state structure is not known a priori. This work demonstrates that accurate predictions of formation energy do not imply accurate predictions of stability, emphasizing the importance of assessing model performance on stability predictions, for which we provide a set of publicly available tests.

Authors (6)
  1. Christopher J. Bartel (21 papers)
  2. Amalie Trewartha (14 papers)
  3. Qi Wang (561 papers)
  4. Alexander Dunn (17 papers)
  5. Anubhav Jain (33 papers)
  6. Gerbrand Ceder (72 papers)
Citations (161)

Summary

A Critical Examination of Compound Stability Predictions from Machine-Learned Formation Energies

In the paper under consideration, the authors provide a detailed evaluation of the efficacy of machine-learned formation energy models in predicting the stability of various chemical compounds. The assessment juxtaposes these models' predictions with those derived from Density Functional Theory (DFT) calculations, employing a substantial dataset of 85,014 unique chemical compositions from the Materials Project. The focus is on determining whether the accurate prediction of formation energies via ML translates directly to reliable stability predictions for new materials.

Key Findings

The paper evaluates seven ML models: five recently published compositional models (Meredig, Magpie, AutoMat, ElemNet, and Roost), a baseline model using stoichiometry alone, and one structural model, the Crystal Graph Convolutional Neural Network (CGCNN). The findings highlight several critical observations:

  1. Formation Energy Prediction: The compositional models predict formation energies with reasonable accuracy; their mean absolute errors (MAEs) approach the discrepancies typically observed between DFT-computed and experimental formation energies.
  2. Stability Prediction Discrepancies: Despite their success in predicting formation energies, the models falter when those predictions are used to forecast compound stability. Specifically, they exhibit considerable inaccuracies in the decomposition enthalpies derived from predicted formation energies, and it is the decomposition enthalpy, the energy of a compound relative to its competing phases, that determines stability within a given chemical space.
  3. Error Cancellation: DFT methodologies inherently benefit from systematic error cancellation when predicting stability. This attribute enables DFT to yield reasonably accurate stability predictions, a property not inherently shared by the ML models examined. The errors in ML predictions did not demonstrate significant systematic cancellation, leading to poor stability predictions, especially when comparing chemically similar compounds.
  4. Structural Model Efficacy: The CGCNN model, which utilizes structural information, significantly outperformed compositional models in stability predictions. This highlights the critical role of structural attributes in accurately capturing the nuanced differences between stable and unstable compounds. However, the reliance on known crystal structures imposes a limitation, as these structures are typically unavailable for novel compositions targeted in discovery efforts.
  5. Sparse Chemical Spaces: As a case study, the paper examines the Li-Mn-TM-O (TM = transition metal) quaternary space and shows that no compositional model successfully identified all stable compounds. The structural model proved considerably more resilient in this regard, underscoring the advantage of incorporating structural data in stability assessments.
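The link between formation energy and stability can be made concrete with a toy convex-hull construction. The sketch below is purely illustrative (a hypothetical binary A-B system with made-up energies, not data from the paper), but it shows why a negative formation energy alone does not imply stability:

```python
# Illustrative sketch (hypothetical system, not the paper's code): a compound
# is stable only if it lies on the convex hull of formation energy vs.
# composition; its decomposition enthalpy is its distance to that hull.
import numpy as np

def hull_energy(x, comps, energies):
    """Lowest-energy linear combination of competing phases at composition x
    (brute force over bracketing pairs; fine for a small phase pool).
    comps are fractions of element B; energies in eV/atom."""
    best = np.inf
    for xi, ei in zip(comps, energies):
        for xj, ej in zip(comps, energies):
            if xi <= x <= xj and xj > xi:
                f = (x - xi) / (xj - xi)
                best = min(best, (1 - f) * ei + f * ej)
    return best

# Competing phases: elements A (x=0) and B (x=1) at 0 eV/atom, plus a
# known compound AB at x=0.5 with formation energy -0.8 eV/atom.
comps = [0.0, 0.5, 1.0]
energies = [0.0, -0.8, 0.0]

# Candidate A3B at x=0.25 with formation energy -0.3 eV/atom: negative
# formation energy, yet unstable, because it sits above the A <-> AB
# tie-line (hull energy at x=0.25 is -0.4 eV/atom).
dHd = -0.3 - hull_energy(0.25, comps, energies)
print(round(dHd, 3))  # 0.1 -> decomposes into A and AB
```

Because stability is this *relative* quantity, small per-compound errors in formation energy can flip the sign of dHd even when the MAE looks impressive.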

Implications and Future Directions

The results delineate important implications for the use of machine learning in materials science:

  • Limitations of Compositional Models: The paper underscores that compositional models, despite their advances in predicting formation energies, are not sufficient stand-alone tools for stability prediction. This constraint suggests that expectations for ML to supplant DFT in materials discovery should be tempered, especially for tasks relying on stability assessment.
  • Structural Information Integration: Given the superior performance of structural models like CGCNN, there is a manifest need to develop methods that can predict plausible structures for uncharacterized compositions to leverage the benefits of structural information.
  • Model Evaluation Frameworks: The authors recommend a rigorous framework for evaluating newly developed ML models for formation energies, with a particular emphasis on assessing their stability prediction capabilities in diverse and sparse chemical spaces.
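One concrete form such a stability test could take (a hypothetical sketch, not the authors' released test suite) is to threshold decomposition enthalpies at zero and score the resulting stable/unstable classification against DFT labels, for example with the F1 score:

```python
# Hypothetical sketch of a stability-classification test: label compounds
# stable when their decomposition enthalpy dHd <= 0, then score ML
# predictions against DFT labels with the F1 score.
def stability_f1(dft_dHd, ml_dHd):
    dft = [d <= 0.0 for d in dft_dHd]
    ml = [d <= 0.0 for d in ml_dHd]
    tp = sum(a and b for a, b in zip(dft, ml))
    fp = sum((not a) and b for a, b in zip(dft, ml))
    fn = sum(a and (not b) for a, b in zip(dft, ml))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example (made-up values, eV/atom): two compounds are DFT-stable;
# the model recovers one of them and raises one false alarm.
dft = [-0.10, 0.05, -0.20, 0.30]  # DFT decomposition enthalpies
ml = [-0.05, -0.02, 0.10, 0.40]   # ML-predicted values
print(stability_f1(dft, ml))  # 0.5
```

Scoring the classification directly, rather than the regression MAE, is exactly what separates a good formation-energy model from a useful stability model in sparse chemical spaces.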

The paper concludes that while machine learning holds marked potential in materials science, especially for formation energy prediction, its application to stability prediction requires further development, particularly in achieving systematic error cancellation and incorporating structural insights. Continuing research should focus on methodologies that predict plausible structures or improve the error cancellation and stability-prediction capacity of compositional models. For practical use in the discovery of new materials, these improvements are indispensable.
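The role of error cancellation can be illustrated numerically. The sketch below uses synthetic data and assumed error magnitudes (nothing here is from the paper) to contrast a systematic error shared by chemically similar compounds, which cancels exactly in the formation-energy difference, with independent per-compound errors, which accumulate instead:

```python
# Synthetic illustration (assumed error model, not the paper's data) of
# error cancellation in stability predictions: decomposition enthalpy is a
# *difference* of formation energies, so an error shared by chemically
# similar compounds drops out, while independent errors add in quadrature.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_hf = rng.uniform(-2.0, 0.0, size=(n, 2))  # target / competitor pairs
true_dhd = true_hf[:, 0] - true_hf[:, 1]

# DFT-like: one shared systematic error per pair -> cancels exactly.
shared = rng.normal(0.0, 0.1, size=n)
corr_dhd = (true_hf[:, 0] + shared) - (true_hf[:, 1] + shared)

# ML-like: independent errors per compound -> errors accumulate.
e1 = rng.normal(0.0, 0.1, size=n)
e2 = rng.normal(0.0, 0.1, size=n)
uncorr_dhd = (true_hf[:, 0] + e1) - (true_hf[:, 1] + e2)

print(np.abs(corr_dhd - true_dhd).mean())    # ~0: shared error cancels
print(np.abs(uncorr_dhd - true_dhd).mean())  # ~0.11 eV/atom: it does not
```

Even with identical per-compound MAEs (0.1 eV/atom here), the correlated-error model makes far better stability calls, which is the mechanism the authors identify behind DFT's advantage.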