
A method for quantifying the generalization capabilities of generative models for solving Ising models (2405.03435v1)

Published 6 May 2024 in cond-mat.dis-nn, cs.AI, and cs.LG

Abstract: For Ising models with complex energy landscapes, whether a neural network can find the ground state depends heavily on the Hamming distance between the training data and the ground state. Although various recently proposed generative models perform well at solving Ising models, how to quantify their generalization capabilities has not been adequately discussed. Here we design a Hamming distance regularizer within the framework of variational autoregressive networks (VAN), a class of generative models, to quantify the generalization capabilities of various network architectures combined with VAN. The regularizer controls the overlap between the network-generated training data and the ground state; this overlap, together with the success rate of finding the ground state, forms a quantitative metric of generalization. We conduct numerical experiments on several prototypical architectures combined with VAN, including feed-forward neural networks, recurrent neural networks, and graph neural networks, to quantify their generalization capabilities when solving Ising models. Moreover, because generalization measured on small-scale problems predicts relative performance on large-scale ones, our method can assist neural architecture search in identifying optimal network architectures for solving large-scale Ising models.
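
To make the setup concrete: VAN (Wu, Wang, and Zhang, Phys. Rev. Lett. 122, 080602) minimizes a variational free energy over an autoregressive distribution q_θ(s). The sketch below shows one plausible way a Hamming distance regularizer toward a reference configuration s* could enter that objective; the coefficient λ and the additive placement are illustrative assumptions, not details taken from the paper.

```latex
% VAN variational free energy over an autoregressive distribution q_\theta:
F_q = \mathbb{E}_{s \sim q_\theta}\!\left[ E(s) + \beta^{-1} \ln q_\theta(s) \right]

% Hamming distance between a sample s and a reference configuration s^*,
% for spins s_i \in \{\pm 1\}:
d_H(s, s^*) = \sum_{i=1}^{N} \frac{1 - s_i s_i^*}{2}

% A hypothetical regularized objective (the coefficient \lambda and this
% additive form are assumptions for illustration):
F_\lambda = F_q + \lambda \, \mathbb{E}_{s \sim q_\theta}\!\left[ d_H(s, s^*) \right]
```

Under this reading, sweeping λ would tune the overlap between generated samples and the ground state, which, combined with the success rate of reaching the ground state, yields the quantitative generalization metric the abstract describes.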
