Symbolic Regression in Materials Science: Discovering Interatomic Potentials from Data (2206.06422v2)
Abstract: Particle-based modeling of materials at the atomic scale plays an important role in the development of new materials and in the understanding of their properties. The accuracy of particle simulations is determined by interatomic potentials, which give the potential energy of an atomic system as a function of atomic coordinates and potentially other properties. First-principles-based ab initio potentials can reach arbitrary levels of accuracy; however, their applicability is limited by their high computational cost. Machine learning (ML) has recently emerged as an effective way to offset the high computational cost of ab initio atomic potentials by replacing expensive models with highly efficient surrogates trained on electronic structure data. Among a plethora of current methods, symbolic regression (SR) is gaining traction as a powerful "white-box" approach for discovering functional forms of interatomic potentials. This contribution discusses the role of symbolic regression in Materials Science (MS) and offers a comprehensive overview of current methodological challenges and state-of-the-art results. A genetic programming-based approach for modeling atomic potentials from raw data (consisting of snapshots of atomic positions and associated potential energy) is presented and empirically validated on ab initio electronic structure data.
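To make the data setting concrete, the following minimal Python sketch shows how a candidate functional form could be scored against snapshots of atomic positions and their associated potential energies. It is an illustration under stated assumptions, not the method of the paper: the helper names (`total_energy`, `rmse`), the restriction to a pairwise potential, the Lennard-Jones test function, and the toy snapshots are all hypothetical, and the reference energies here are generated synthetically rather than taken from ab initio data.

```python
import math
from itertools import combinations


def total_energy(positions, pair_potential):
    """Sum a candidate pair potential over all unique atom pairs in one snapshot."""
    return sum(pair_potential(math.dist(a, b)) for a, b in combinations(positions, 2))


def rmse(snapshots, reference_energies, pair_potential):
    """Root-mean-square error between predicted and reference snapshot energies."""
    errors = [
        total_energy(positions, pair_potential) - energy
        for positions, energy in zip(snapshots, reference_energies)
    ]
    return math.sqrt(sum(err * err for err in errors) / len(errors))


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Assumed example form; a symbolic regression search would propose such expressions."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


# Toy snapshots (hypothetical): lists of 3D atomic coordinates.
snapshots = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)],
]

# Synthetic reference energies standing in for electronic structure data.
reference_energies = [total_energy(p, lennard_jones) for p in snapshots]

# Score two candidate expressions; a lower error means a better fit.
print(rmse(snapshots, reference_energies, lennard_jones))
print(rmse(snapshots, reference_energies, lambda r: 1.0 / r))
```

In a genetic programming setting, an error of this kind would typically serve as the fitness of each candidate expression, with the search varying the expression itself rather than only its parameters.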