Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space (1909.11655v4)

Published 25 Sep 2019 in cs.NE, cs.LG, physics.chem-ph, and physics.comp-ph

Abstract: Challenges in natural sciences can often be phrased as optimization problems. Machine learning techniques have recently been applied to solve such problems. One example in chemistry is the design of tailor-made organic materials and molecules, which requires efficient methods to explore the chemical space. We present a genetic algorithm (GA) that is enhanced with a neural network (DNN) based discriminator model to improve the diversity of generated molecules and at the same time steer the GA. We show that our algorithm outperforms other generative models in optimization tasks. We furthermore present a way to increase interpretability of genetic algorithms, which helped us to derive design principles.



The paper "Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space" addresses optimization problems in the natural sciences, particularly in chemistry. It focuses on the design of tailor-made organic materials and molecules, and uses a genetic algorithm (GA) enhanced by a deep neural network (DNN) to explore chemical space more effectively.

The paper's core proposal is a genetic algorithm with a neural-network-based adaptive penalty that promotes exploration and increases the diversity of generated molecules. The algorithm uses the SELFIES string representation to keep every generated molecule valid without domain-specific mutation or crossover rules. The main enhancement is the discriminator, a neural network that shapes the GA population: it penalizes long-surviving solutions, which would otherwise cause the search to stagnate.
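To illustrate why a representation in which every token string stays valid removes the need for hand-written mutation rules, here is a minimal sketch of a random point mutation on a SELFIES-style token string. The toy alphabet `ALPHABET`, the helper names `split_tokens` and `mutate`, and the 50/50 split between replacement and insertion are assumptions made for illustration, not the authors' implementation. Decoding the result into a molecule is left out and would be handled by the selfies package.

```python
import random
import re

# Hypothetical toy alphabet of SELFIES-style tokens (assumption for illustration).
ALPHABET = ["[C]", "[N]", "[O]", "[F]", "[=C]", "[=O]", "[Branch1]", "[Ring1]"]


def split_tokens(selfies_string):
    """Split a bracketed string such as '[C][=O][N]' into its tokens."""
    return re.findall(r"\[[^\]]*\]", selfies_string)


def mutate(selfies_string, rng):
    """Apply one random point mutation: replace a token or insert a new one."""
    tokens = split_tokens(selfies_string)
    new_token = rng.choice(ALPHABET)
    if tokens and rng.random() < 0.5:
        # Replacement: overwrite a randomly chosen position.
        tokens[rng.randrange(len(tokens))] = new_token
    else:
        # Insertion: add the token at a random position (including both ends).
        tokens.insert(rng.randrange(len(tokens) + 1), new_token)
    return "".join(tokens)


rng = random.Random(0)
parent = "[C][C][=O][N]"
child = mutate(parent, rng)
```

Because the mutation only swaps or inserts whole tokens, no chemistry-specific rule is needed at this step; the representation itself is relied on to keep the string decodable.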

Technical Contributions and Results

Hybrid Algorithm Design: The GA is augmented with a DNN to form a hybrid algorithm that seeks molecules whose properties are optimized according to a given fitness function while also exploring diverse regions of chemical space. A key component is Equation 1, where the fitness function $F(m) = J(m) + \beta \cdot D(m)$ combines the property score $J(m)$ with the discriminator score $D(m)$, weighted by $\beta$.
Diversity Enhancement: By applying an adaptive penalty through a neural-network-based discriminator, the algorithm moves away from local optima on its own, promoting novelty and diversity. The network modulates the population dynamics: it lowers the fitness score of long-surviving molecules, making them less likely to dominate over many generations.
Benchmark Performance: The algorithm outperforms other generative models, such as VAEs and GANs, in molecular design tasks. In particular, it achieves higher scores in penalized logP optimization than other published approaches, as reported in Table 1 of the paper.
Potential Applicability: Although primarily tested in the domain of chemistry, the framework is domain-agnostic, offering a generalized solution applicable across different fields requiring optimization under constraints.
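
The fitness in Equation 1 and its effect on selection can be sketched as follows. This is a minimal illustration under stated assumptions: `property_score` stands in for $J(m)$, and `discriminator_score` replaces the trained network's $D(m)$ with a simple age-based placeholder that shrinks the longer a molecule survives. The `Candidate` class, the `select` helper, and all numeric values are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str              # identifier of the molecule (e.g. a SELFIES string)
    property_score: float  # J(m), the property being optimized
    age: int               # generations survived so far (input to the stand-in D)


def discriminator_score(candidate):
    """Stand-in for D(m): assumed to shrink as a molecule survives longer.

    In the paper D(m) comes from a trained neural network; this closed-form
    function is only a placeholder with a similar qualitative effect.
    """
    return 1.0 / (1.0 + candidate.age)


def fitness(candidate, beta):
    """F(m) = J(m) + beta * D(m), following Equation 1."""
    return candidate.property_score + beta * discriminator_score(candidate)


def select(population, beta, keep):
    """Keep the `keep` fittest candidates under the combined fitness."""
    ranked = sorted(population, key=lambda c: fitness(c, beta), reverse=True)
    return ranked[:keep]


population = [
    Candidate("old_best", property_score=5.0, age=9),
    Candidate("new_a", property_score=4.8, age=0),
    Candidate("new_b", property_score=4.5, age=0),
]

top_without_penalty = select(population, beta=0.0, keep=1)[0].name
top_with_penalty = select(population, beta=1.0, keep=1)[0].name
```

With `beta=0.0` the long-lived `old_best` stays on top; with `beta=1.0` the fresh candidate `new_a` overtakes it, which is the stagnation-breaking behavior the discriminator term is meant to provide.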

Implications and Future Directions

The research has implications for both the theory and the practice of molecular design. The hybrid GA-DNN model offers a promising avenue for chemistry and materials science researchers who want to use computational tools for novel material discovery. Because the algorithm explores vast chemical spaces without expert input or predefined mutation rules, it could change how computational design is approached in organic chemistry.

Moving forward, one proposed extension includes the integration of real-time machine learning models that could predict molecular properties more efficiently during the GA runs, potentially reducing computational load and enabling real-time feedback loops.

In conclusion, this paper extends the capabilities of genetic algorithms within the chemical space by integrating deep learning components, providing a more robust method to tackle complex molecular design tasks. This approach holds promise for broader applications across scientific research areas where optimization and innovation are critical.


Authors (4)

AkshatKumar Nigam (10 papers)
Pascal Friederich (35 papers)
Mario Krenn (74 papers)
Alán Aspuru-Guzik (227 papers)
