
ANI-1: A data set of 20M off-equilibrium DFT calculations for organic molecules (1708.04987v4)

Published 16 Aug 2017 in physics.chem-ph, cs.LG, and physics.data-an

Abstract: One of the grand challenges in modern theoretical chemistry is designing and implementing approximations that expedite ab initio methods without loss of accuracy. Machine learning (ML) methods, in particular neural networks, are emerging as a powerful approach to constructing various forms of transferable atomistic potentials. They have been successfully applied in a variety of areas in chemistry, biology, catalysis, and solid-state physics. However, these models are heavily dependent on the quality and quantity of data used in their fitting. Fitting highly flexible ML potentials comes at a cost: a vast amount of reference data is required to properly train these models. We address this need by providing access to a large computational DFT database, which consists of 20M conformations for 57,454 small organic molecules. We believe it will become a new standard benchmark for comparison of current and future methods in the ML potential community.

Authors (3)
  1. Justin S. Smith (21 papers)
  2. Olexandr Isayev (20 papers)
  3. Adrian E. Roitberg (7 papers)
Citations (5)
