
Beyond the Typical: Modeling Rare Plausible Patterns in Chemical Reactions by Leveraging Sequential Mixture-of-Experts (2310.04674v2)

Published 7 Oct 2023 in cs.LG and physics.chem-ph

Abstract: Reaction prediction, a critical task in synthetic chemistry, aims to predict the outcome of a reaction from the given reactants. Generative models such as Transformers and VAEs have typically been employed to predict the reaction product. However, these likelihood-maximization models overlook the inherently stochastic nature of chemical reactions, such as the multiple ways electrons can be redistributed among atoms during the reaction process. When similar reactants can follow different electron redistribution patterns, these models typically predict the most common outcomes and neglect less frequent but potentially crucial reaction patterns. These overlooked patterns, though rare, can lead to innovative methods for designing synthetic routes and can significantly advance synthesis techniques. To overcome the limits of previous approaches, we propose organizing the mapping space between reactants and electron redistribution patterns in a divide-and-conquer manner. We address the reaction prediction problem by training multiple expert models, each specializing in one type of electron redistribution pattern. These experts enhance the prediction process by considering both typical and less common electron redistribution patterns. In the inference stage, a dropout strategy is applied to each expert to improve the diversity of predicted electron redistributions. The most plausible products are finally predicted through a ranking stage designed to integrate the predictions from multiple experts. Experimental results on USPTO-MIT, the largest reaction prediction benchmark, show the superior performance of our proposed method compared to baselines.
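The inference flow described in the abstract (several experts, dropout applied to each expert at inference time, then a ranking stage over the pooled predictions) can be illustrated with a small toy sketch. This is not the paper's implementation: the linear "experts", the feature vectors, the names `make_expert` and `predict`, and the vote-count ranking are all hypothetical stand-ins, chosen only to show how sampling with dropout can surface less common patterns before ranking.

```python
import random
from collections import Counter


def make_expert(weights):
    """Build a toy 'expert' (hypothetical): a linear scorer over patterns.

    weights maps each candidate pattern name to a list of per-feature weights.
    """
    def expert(features, drop_rate, rng):
        if not 0.0 <= drop_rate < 1.0:
            raise ValueError("drop_rate must be in [0, 1)")
        keep = 1.0 - drop_rate
        # Inference-time dropout: zero each feature with probability
        # drop_rate and rescale the surviving features by 1 / keep.
        masked = [x / keep if rng.random() < keep else 0.0 for x in features]
        scores = {
            pattern: sum(w * x for w, x in zip(ws, masked))
            for pattern, ws in weights.items()
        }
        # Each stochastic pass returns the expert's top-scoring pattern.
        return max(scores, key=scores.get)
    return expert


def predict(experts, features, n_samples=20, drop_rate=0.2, top_k=3, seed=0):
    """Pool dropout samples from every expert, then rank the candidates.

    Ranking by vote count is an assumption made for this sketch; the
    abstract only states that a ranking stage integrates the experts.
    """
    rng = random.Random(seed)
    votes = Counter()
    for expert in experts:
        for _ in range(n_samples):
            votes[expert(features, drop_rate, rng)] += 1
    return [pattern for pattern, _ in votes.most_common(top_k)]


if __name__ == "__main__":
    # Two toy experts that prefer different (made-up) patterns.
    typical_expert = make_expert({"typical": [1.0, 0.0], "rare": [0.0, 1.0]})
    rare_expert = make_expert({"typical": [0.1, 0.1], "rare": [1.0, 1.0]})
    print(predict([typical_expert, rare_expert], [1.0, 0.5]))
```

Because each expert is sampled several times under dropout, a pattern that a single deterministic pass would never select can still enter the candidate pool, and the ranking step then decides which candidates to report.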

