Generative adversarial networks (GAN) based efficient sampling of chemical space for inverse design of inorganic materials (1911.05020v1)

Published 12 Nov 2019 in cs.LG, cs.NE, and stat.ML

Abstract: A major challenge in materials design is how to efficiently search the vast chemical design space to find the materials with desired properties. One effective strategy is to develop sampling algorithms that can exploit both explicit chemical knowledge and implicit composition rules embodied in the large materials database. Here, we propose a generative machine learning model (MatGAN) based on a generative adversarial network (GAN) for efficient generation of new hypothetical inorganic materials. Trained with materials from the ICSD database, our GAN model can generate hypothetical materials not existing in the training dataset, reaching a novelty of 92.53% when generating 2 million samples. The percentage of chemically valid (charge neutral and electronegativity balanced) samples out of all generated ones reaches 84.5% by our GAN when trained with materials from ICSD even though no such chemical rules are explicitly enforced in our GAN model, indicating its capability to learn implicit chemical composition rules. Our algorithm could be used to speed up inverse design or computational screening of inorganic materials.

Summary

The paper proposes MatGAN, a Generative Adversarial Network (GAN) based model for efficiently sampling inorganic chemical space to enable inverse materials design.
MatGAN achieved high novelty (92.53%) and chemical validity (84.5%) rates for generated hypothetical materials, demonstrating its ability to learn implicit compositional rules from existing databases.
This methodology offers a robust framework for efficient materials discovery, with potential applications in expanding material databases and facilitating high-throughput computational screening.

Efficient Sampling of Chemical Space for Inverse Design of Inorganic Materials Using GANs

The paper proposes a novel approach to efficiently explore the vast chemical design space for inorganic materials using Generative Adversarial Networks (GANs). This research addresses the significant challenge of materials design, where identifying new materials with desired properties is complicated by the sheer magnitude of combinatorial possibilities in their chemical composition.

Core Contributions

The authors introduce MatGAN, a GAN-based generative model that learns implicit chemical composition rules from large inorganic materials databases such as ICSD, OQMD, and the Materials Project. Trained only on known inorganic materials, the model generates novel compositions while largely preserving chemical validity: a high fraction of its samples are valid even though no chemical rules are explicitly enforced during training.

Methodology

MatGAN represents each inorganic material as a sparse matrix encoding of its composition, which simplifies learning for the GAN. The generator and discriminator are deep neural networks built from convolutional and deconvolutional layers. To stabilize training and improve convergence, the authors adopt a Wasserstein GAN objective, under which the generator is trained to reduce the Wasserstein distance between the generated and real sample distributions.
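The summary does not spell out the layout of the sparse matrix, so the following is only a minimal sketch of one plausible encoding, not the authors' implementation. It assumes one column per element and a one-hot column vector giving the number of atoms of that element, with absent elements left as all-zero columns. The element vocabulary, the cap `MAX_ATOMS`, and the helper names `parse_formula`, `encode`, and `decode` are all hypothetical and chosen for illustration.

```python
import re
import numpy as np

# Hypothetical, shortened element vocabulary; the matrix dimensions that
# MatGAN actually uses are not given in this summary.
ELEMENTS = ["H", "Li", "O", "Na", "Mg", "Al", "Cl", "Ti", "Fe", "Sr"]
MAX_ATOMS = 8  # assumed cap on the number of atoms of any one element


def parse_formula(formula):
    """Parse a flat formula such as 'SrTiO3' into {element: count}.

    Parentheses and fractional counts are not supported.
    """
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def encode(formula):
    """Return a MAX_ATOMS x len(ELEMENTS) matrix of 0s and 1s.

    Column j describes ELEMENTS[j]. A 1 in row i means the formula
    contains i + 1 atoms of that element; an absent element leaves its
    column all zero (an assumption of this sketch).
    """
    counts = parse_formula(formula)
    unknown = set(counts) - set(ELEMENTS)
    if unknown:
        raise ValueError(f"elements outside the vocabulary: {sorted(unknown)}")
    matrix = np.zeros((MAX_ATOMS, len(ELEMENTS)), dtype=np.int8)
    for j, element in enumerate(ELEMENTS):
        n = counts.get(element, 0)
        if n > MAX_ATOMS:
            raise ValueError(f"{element}{n} exceeds MAX_ATOMS={MAX_ATOMS}")
        if n >= 1:
            matrix[n - 1, j] = 1
    return matrix


def decode(matrix):
    """Invert encode(); elements come out in vocabulary order."""
    parts = []
    for j, element in enumerate(ELEMENTS):
        column = matrix[:, j]
        if column.any():
            n = int(column.argmax()) + 1
            parts.append(element if n == 1 else f"{element}{n}")
    return "".join(parts)
```

For example, `encode("SrTiO3")` sets exactly three entries to 1, and `decode` recovers the same composition, although the elements appear in vocabulary order rather than in conventional formula order. A generator's real-valued output could be mapped back to a formula the same way, by taking the `argmax` of each sufficiently activated column.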

Results

Novelty and Validity: MatGAN achieved a novelty rate of 92.53% for 2 million samples, with 84.5% meeting the criteria of charge neutrality and balanced electronegativity. This indicates the GAN's capability to internalize implicit compositional rules, markedly outperforming traditional enumeration approaches in generating chemically valid samples.
Formation Energy: Most generated samples had predicted formation energies below zero, suggesting that many of the hypothetical materials could be thermodynamically stable. In particular, GANs trained on the ICSD and MP datasets produced a higher proportion of low-formation-energy materials.
Uniqueness: Even after generating large numbers of samples, the GAN model maintained a substantial uniqueness rate (85.9% for GAN-MP), showing that it produces diverse hypothetical materials that go beyond the training and validation datasets.
Conditional Generation: When trained with high-bandgap materials, MatGAN successfully generated hypothetical new materials with similar properties, validating its potential for conditional generation aligned with specific material characteristics.
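The novelty, uniqueness, and validity figures above can be illustrated with a small sketch. It is not the authors' evaluation code, and it rests on several assumptions: uniqueness is taken as the fraction of generated samples that are distinct, novelty as the fraction of distinct generated samples absent from the training set, and validity is reduced to a charge-neutrality check in which all atoms of an element share one oxidation state. The electronegativity-balance test is omitted, formulas are not reduced (Na2Cl2 and NaCl count as different), and the `OXIDATION_STATES` table is a tiny illustrative subset.

```python
from itertools import product

# Illustrative common oxidation states for a few elements only.
OXIDATION_STATES = {
    "Na": [1],
    "Sr": [2],
    "Ti": [2, 3, 4],
    "Fe": [2, 3],
    "O": [-2],
    "Cl": [-1],
}


def is_charge_neutral(composition):
    """Check whether some oxidation-state assignment sums to zero.

    composition maps element -> atom count. Simplification: every atom
    of an element gets the same oxidation state, so mixed-valence
    compounds such as Fe3O4 are rejected by this sketch.
    """
    if any(el not in OXIDATION_STATES for el in composition):
        return False  # cannot be verified with this small table
    counts = list(composition.values())
    options = [OXIDATION_STATES[el] for el in composition]
    return any(
        sum(q * n for q, n in zip(states, counts)) == 0
        for states in product(*options)
    )


def canonical(composition):
    """Order-independent, hashable key for a composition."""
    return tuple(sorted(composition.items()))


def uniqueness(generated):
    """Fraction of generated samples that are distinct."""
    if not generated:
        return 0.0
    return len({canonical(c) for c in generated}) / len(generated)


def novelty(generated, training):
    """Fraction of distinct generated samples not in the training set."""
    unique = {canonical(c) for c in generated}
    if not unique:
        return 0.0
    known = {canonical(c) for c in training}
    return len(unique - known) / len(unique)


def validity(generated):
    """Fraction of generated samples passing the charge-neutrality check."""
    if not generated:
        return 0.0
    return sum(is_charge_neutral(c) for c in generated) / len(generated)
```

With a generated set of NaCl, NaCl, SrTiO3, and NaO and a training set containing only NaCl, this sketch gives a uniqueness of 0.75, a novelty of 2/3, and a validity of 0.75. The reported percentages depend on the paper's exact definitions, which may differ from the ones assumed here.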

Implications and Future Directions

The methodology presented by the authors provides a robust framework for efficient chemical space exploration, potentially enhancing materials discovery processes across various industries. As demonstrated, GANs can serve as powerful tools for expanding existing materials databases, facilitating high-throughput computational screening, and identifying new compounds with critical properties.

Future research can extend the MatGAN framework to accommodate real-number representations for doped materials and explore integrations with other sampling techniques or computational models to predict crystal structures. Additionally, implementing explicit chemical filters within the GAN architecture may further refine the generation process, aligning with specific structural or composition constraints.

In conclusion, the paper shows that generative models, and GANs in particular, can be applied to the inverse design of inorganic materials. The proposed approach offers a route to more automated materials discovery, increasing the speed and scope with which candidate materials can be identified for technological and industrial applications.