Operator SVD with Neural Networks via Nested Low-Rank Approximation (2402.03655v2)

Published 6 Feb 2024 in cs.LG, cs.NA, math.NA, and stat.ML

Abstract: Computing the eigenvalue decomposition (EVD) of a given linear operator, or finding its leading eigenvalues and eigenfunctions, is a fundamental task in many machine learning and scientific computing problems. For high-dimensional eigenvalue problems, training neural networks to parameterize the eigenfunctions is considered a promising alternative to classical numerical linear algebra techniques. This paper proposes a new optimization framework based on the low-rank approximation characterization of a truncated singular value decomposition, accompanied by new techniques, collectively called \emph{nesting}, for learning the top-$L$ singular values and singular functions in the correct order. The proposed method promotes the desired orthogonality in the learned functions implicitly and efficiently via an unconstrained optimization formulation, which is easy to solve with off-the-shelf gradient-based optimization algorithms. We demonstrate the effectiveness of the proposed optimization framework for use cases in computational physics and machine learning.
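
The abstract describes the method only at a high level. As a rough illustration of the two ideas it names, the sketch below shows a low-rank approximation (Eckart-Young style) loss for a kernel-represented operator, plus a nested variant that sums the loss over the leading 1, ..., L modes so the top singular functions are learned in the correct order. This is a minimal sketch under assumptions: the operator is accessed through a kernel matrix on minibatch samples, the Monte Carlo normalization and uniform nesting weights are illustrative, and the function names (`lora_objective`, `nested_lora`) are hypothetical rather than the paper's implementation.

```python
import torch

def lora_objective(F, G, K):
    """Low-rank approximation loss for an operator given by a kernel.

    F: (n, L) left feature matrix, F[i, l] = f_l(x_i)
    G: (m, L) right feature matrix, G[j, l] = g_l(y_j)
    K: (n, m) kernel matrix, K[i, j] = k(x_i, y_j)

    Returns an empirical estimate of
        ||T - sum_l f_l ⊗ g_l||_HS^2 - ||T||_HS^2
      = -2 sum_l <f_l, T g_l> + sum_{l,l'} <f_l, f_{l'}> <g_l, g_{l'}>,
    which is minimized when {f_l}, {g_l} span the top-L singular subspaces.
    """
    n, m = K.shape
    # -2 * sum_l <f_l, T g_l>, estimated by (1/(n*m)) * F[:, l]^T K G[:, l]
    cross = -2.0 * torch.einsum('il,ij,jl->', F, K, G) / (n * m)
    # sum_{l,l'} <f_l, f_{l'}> <g_l, g_{l'}> from empirical Gram matrices;
    # this term implicitly penalizes non-orthogonal learned functions
    gram_f = F.T @ F / n
    gram_g = G.T @ G / m
    return cross + (gram_f * gram_g).sum()

def nested_lora(F, G, K, weights=None):
    """Nesting (joint form): a weighted sum of the loss over the leading
    1, ..., L columns, so column l can only reduce the loss by capturing
    the l-th singular mode, ordering the learned functions correctly."""
    L = F.shape[1]
    weights = weights if weights is not None else [1.0] * L
    return sum(w * lora_objective(F[:, :l + 1], G[:, :l + 1], K)
               for l, w in enumerate(weights))
```

In practice F and G would be the outputs of two neural networks evaluated on sampled points, and this loss would be minimized with any off-the-shelf gradient-based optimizer, matching the unconstrained formulation the abstract emphasizes.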
