Holistic chemical evaluation reveals pitfalls in reaction prediction models (2312.09004v1)

Published 14 Dec 2023 in physics.chem-ph and cs.LG

Abstract: The prediction of chemical reactions has gained significant interest within the machine learning community in recent years, owing to its complexity and crucial applications in chemistry. However, model evaluation for this task has been mostly limited to simple metrics like top-k accuracy, which obfuscates fine details of a model's limitations. Inspired by progress in other fields, we propose a new assessment scheme that builds on top of current approaches, steering towards a more holistic evaluation. We introduce the following key components for this goal: CHORISO, a curated dataset along with multiple tailored splits to recreate chemically relevant scenarios, and a collection of metrics that provide a holistic view of a model's advantages and limitations. Application of this method to state-of-the-art models reveals important differences on sensitive fronts, especially stereoselectivity and chemical out-of-distribution generalization. Our work paves the way towards robust prediction models that can ultimately accelerate chemical discovery.
