Reconstructing Kernel-based Machine Learning Force Fields with Super-linear Convergence (2212.12737v2)

Published 24 Dec 2022 in physics.chem-ph, cs.LG, physics.comp-ph, and stat.ML

Abstract: Kernel machines have sustained continuous progress in the field of quantum chemistry. In particular, they have proven to be successful in the low-data regime of force field reconstruction. This is because many equivariances and invariances due to physical symmetries can be incorporated into the kernel function to compensate for much larger datasets. So far, the scalability of kernel machines has, however, been hindered by their quadratic memory and cubic runtime complexity in the number of training points. While it is known that iterative Krylov subspace solvers can overcome these burdens, their convergence crucially relies on effective preconditioners, which are elusive in practice. Effective preconditioners need to partially pre-solve the learning problem in a computationally cheap and numerically robust manner. Here, we consider the broad class of Nyström-type methods to construct preconditioners based on successively more sophisticated low-rank approximations of the original kernel matrix, each of which provides a different set of computational trade-offs. All considered methods aim to identify a representative subset of inducing (kernel) columns to approximate the dominant kernel spectrum.
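
To make the abstract's core idea concrete, the sketch below shows how a Nyström-type low-rank approximation of a kernel matrix can serve as a preconditioner for a Krylov (conjugate gradient) solver. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses a plain RBF kernel and a kernel ridge system (the paper targets gradient-domain force-field kernels), picks the inducing columns uniformly at random, and applies the inverse of the approximation via the Woodbury identity. The names `rbf_kernel`, `lam`, and the sampling rule are illustrative choices.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Plain RBF kernel matrix; a stand-in for the paper's force-field kernels.
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def nystrom_preconditioner(K, idx, lam, jitter=1e-10):
    # Nystrom approximation K_hat = C W^{-1} C^T from the inducing columns `idx`.
    # Returns a function applying (K_hat + lam*I)^{-1} via the Woodbury identity:
    # (lam*I + C W^{-1} C^T)^{-1} v = (v - C (lam*W + C^T C)^{-1} C^T v) / lam
    C = K[:, idx]                        # n x m sampled kernel columns
    W = K[np.ix_(idx, idx)]              # m x m core block (a copy, K untouched)
    W += jitter * np.eye(len(idx))       # numerical safeguard for the core
    M = lam * W + C.T @ C                # small m x m inner system
    def apply(v):
        return (v - C @ np.linalg.solve(M, C.T @ v)) / lam
    return apply

def pcg(matvec, b, precond, tol=1e-8, maxiter=1000):
    # Standard preconditioned conjugate gradients; returns (solution, iterations).
    x = np.zeros_like(b)
    r = b.copy()                         # residual b - A @ 0
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for k in range(maxiter):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k + 1
        z = precond(r)
        rz_next = r @ z
        p = z + (rz_next / rz) * p
        rz = rz_next
    return x, maxiter

# Toy comparison: iteration counts with and without the preconditioner.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))        # e.g. 500 descriptors in 3-D
y = rng.standard_normal(500)
lam = 1e-4
K = rbf_kernel(X, X)
A = lambda v: K @ v + lam * v            # (K + lam*I) @ v; the inverse is never formed

idx = rng.choice(500, size=50, replace=False)   # uniform inducing-column sampling
P = nystrom_preconditioner(K, idx, lam)

_, it_plain = pcg(A, y, lambda v: v.copy())     # identity preconditioner = plain CG
_, it_nys = pcg(A, y, P)
print(f"CG iterations: {it_plain} plain vs. {it_nys} Nystrom-preconditioned")
```

With an inducing set that captures the dominant kernel spectrum, the preconditioned solve typically needs far fewer iterations than plain CG. The "successively more sophisticated" Nyström methods the abstract refers to differ precisely in how the subset `idx` is selected; the uniform sampling above is only the simplest member of that family.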
