
Algebraic Machine Learning with an Application to Chemistry (2205.05795v4)

Published 11 May 2022 in math.AG, cs.CG, cs.LG, math-ph, and math.MP

Abstract: As datasets used in scientific applications become more complex, studying the geometry and topology of data has become an increasingly prevalent part of the data analysis process. This can be seen, for example, in the growing interest in topological tools such as persistent homology. However, on the one hand, topological tools are inherently limited to providing only coarse information about the underlying space of the data. On the other hand, more geometric approaches rely predominantly on the manifold hypothesis, which asserts that the underlying space is a smooth manifold. This assumption fails for many physical models where the underlying space contains singularities. In this paper we develop a machine learning pipeline that captures fine-grained geometric information without having to rely on any smoothness assumptions. Our approach involves working within the scope of algebraic geometry and algebraic varieties instead of differential geometry and smooth manifolds. In the setting of the variety hypothesis, the learning problem is to find the underlying variety from sample data. We cast this learning problem as a Maximum A Posteriori optimization problem, which we solve in terms of an eigenvalue computation. Having found the underlying variety, we explore the use of Gröbner bases and numerical methods to reveal information about its geometry. In particular, we propose a heuristic for numerically detecting points lying near the singular locus of the underlying variety.
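To make the idea of "finding a variety through an eigenvalue computation" concrete, here is a minimal sketch. It is not the paper's pipeline: it replaces the Maximum A Posteriori formulation with plain least squares (no prior, no noise model), fits a single polynomial of a fixed degree, and uses a small gradient norm as a stand-in for the paper's singularity heuristic. All function names and the choice of the nodal cubic y^2 = x^2(x + 1) as test data are assumptions made for illustration.

```python
import itertools
import numpy as np

def monomial_exponents(n_vars, degree):
    """Exponent tuples of all monomials with total degree <= degree."""
    return [e for e in itertools.product(range(degree + 1), repeat=n_vars)
            if sum(e) <= degree]

def feature_matrix(points, exponents):
    """Evaluate every monomial at every sample (rows = samples)."""
    return np.stack(
        [np.prod(points ** np.array(e), axis=1) for e in exponents], axis=1
    )

def fit_vanishing_polynomial(points, degree):
    """Unit-norm coefficient vector minimizing the mean squared value of the
    polynomial on the samples: the eigenvector of the monomial Gram matrix
    belonging to its smallest eigenvalue."""
    exps = monomial_exponents(points.shape[1], degree)
    V = feature_matrix(points, exps)
    gram = V.T @ V / len(points)
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    return exps, eigvecs[:, 0], eigvals[0]

def gradient_norm(points, exps, coeffs):
    """Norm of the gradient of the fitted polynomial at each sample."""
    grads = np.zeros_like(points, dtype=float)
    for c, e in zip(coeffs, exps):
        for j in range(points.shape[1]):
            if e[j] == 0:
                continue  # monomial does not depend on variable j
            d = np.array(e)
            d[j] -= 1
            grads[:, j] += c * e[j] * np.prod(points ** d, axis=1)
    return np.linalg.norm(grads, axis=1)

# Illustrative data (an assumption, not from the paper): samples on the
# nodal cubic y^2 = x^2 (x + 1), which is singular at the origin.
t = np.linspace(-1.5, 1.5, 200)
cubic_points = np.column_stack([t**2 - 1, t * (t**2 - 1)])

exps, coeffs, residual = fit_vanishing_polynomial(cubic_points, degree=3)
norms = gradient_norm(cubic_points, exps, coeffs)
candidate = cubic_points[np.argmin(norms)]  # sample with the smallest gradient
```

On this curve the gradient of the defining polynomial vanishes only at the node, so the sample with the smallest gradient norm lies close to the origin. With noisy data the smallest eigenvalue is no longer near zero, and both the degree and a gradient threshold would have to be chosen; the sketch does not address either.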

