Gaussian process model of 51-dimensional potential energy surface for protonated imidazole dimer (2001.07271v4)
Abstract: The goal of the present work is to obtain accurate potential energy surfaces (PES) for high-dimensional molecular systems with a small number of ${\it ab}$ ${\it initio}$ calculations in a system-agnostic way. We use probabilistic modeling based on Gaussian processes (GPs). We illustrate that it is possible to build an accurate GP model of a 51-dimensional PES based on $5000$ randomly distributed ${\it ab}$ ${\it initio}$ calculations with a global accuracy of $< 0.2$ kcal/mol. Our approach uses GP models with composite kernels designed to enhance the Bayesian information content and represents the global PES as a sum of a full-dimensional GP and several GP models for molecular fragments of lower dimensionality. We demonstrate the potency of these algorithms by constructing the global PES for the protonated imidazole dimer, a molecular system with $19$ atoms. We illustrate that GP models thus constructed can extrapolate the PES from low energies ($< 10,000$ cm$^{-1}$), yielding a PES at high energies ($> 20,000$ cm$^{-1}$). This opens the prospect for new applications of GPs, such as mapping out phase transitions by extrapolation or accelerating Bayesian optimization, for high-dimensional physics and chemistry problems with a restricted number of inputs, i.e. for high-dimensional problems where obtaining training data is very difficult.
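The decomposition described in the abstract, a global surface written as a sum of a full-dimensional GP and lower-dimensional fragment GPs with composite kernels, can be illustrated with a minimal NumPy sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the 4-dimensional toy energy function, the split into two 2-dimensional "fragments", the particular composite kernel (a product of RBF and rational-quadratic kernels plus a short-range RBF term), the fixed hyperparameters (no optimization and no kernel selection), and the choice to train the full-dimensional GP on the residual left by the fragment GPs.

```python
import numpy as np

def sq_dist(A, B):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)

def rbf(A, B, ls):
    return np.exp(-0.5 * sq_dist(A, B) / ls**2)

def rational_quadratic(A, B, ls, alpha):
    return (1.0 + sq_dist(A, B) / (2.0 * alpha * ls**2)) ** (-alpha)

def composite_kernel(A, B):
    # Hypothetical composite kernel: products and sums of simple kernels.
    # The form and the fixed hyperparameters are illustrative assumptions.
    return rbf(A, B, 2.0) * rational_quadratic(A, B, 1.5, 1.0) + 0.1 * rbf(A, B, 0.5)

def gp_fit(X, y, noise=1e-6):
    # Constant-mean GP regression; the small noise term acts as jitter.
    mean = y.mean()
    K = composite_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - mean))
    return X, alpha, mean

def gp_predict(model, Xs):
    # Posterior mean at the query points Xs.
    X, alpha, mean = model
    return composite_kernel(Xs, X) @ alpha + mean

# Toy system (assumption): two 2-D fragments plus a weak coupling term.
FRAG_A, FRAG_B = [0, 1], [2, 3]

def e_frag(x):
    # Harmonic energy of an isolated fragment, arbitrary units.
    return 0.5 * (x ** 2).sum(axis=1)

def e_total(x):
    return e_frag(x[:, FRAG_A]) + e_frag(x[:, FRAG_B]) + 0.1 * x[:, 0] * x[:, 2]

rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(150, 4))
X_test = rng.uniform(-1.0, 1.0, size=(200, 4))

# Lower-dimensional GPs, one per fragment, trained on fragment energies.
gp_a = gp_fit(X_train[:, FRAG_A], e_frag(X_train[:, FRAG_A]))
gp_b = gp_fit(X_train[:, FRAG_B], e_frag(X_train[:, FRAG_B]))

def fragment_sum(x):
    return gp_predict(gp_a, x[:, FRAG_A]) + gp_predict(gp_b, x[:, FRAG_B])

# Full-dimensional GP trained on what the fragment GPs do not capture.
gp_full = gp_fit(X_train, e_total(X_train) - fragment_sum(X_train))

# Global surface = fragment GPs + full-dimensional GP.
E_pred = fragment_sum(X_test) + gp_predict(gp_full, X_test)
rmse = float(np.sqrt(np.mean((E_pred - e_total(X_test)) ** 2)))
print(f"test RMSE (arbitrary units): {rmse:.2e}")
```

In this toy setting the fragment GPs absorb most of the energy variation, so the full-dimensional GP only has to learn the small coupling term from the same 150 points. Whether the paper trains its full-dimensional model on a residual in this way is not stated in the abstract; the sketch only shows one plausible way to realize the sum.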