Hebbian Learning from First Principles (2401.07110v2)
Abstract: Recently, the original storage prescription for the Hopfield model of neural networks, as well as for its dense generalizations, has been turned into a genuine Hebbian learning rule by postulating the expression of its Hamiltonian for both the supervised and unsupervised protocols. In these notes, we first obtain these explicit expressions by relying upon maximum entropy extremization à la Jaynes. Beyond providing a formal derivation of these recipes for Hebbian learning, this construction also highlights how the Lagrangian constraints within the entropy extremization force the network's outcomes onto the neural correlations: the latter are driven to mimic the empirical correlations hidden in the datasets supplied to the network for training and, the denser the network, the higher the order of the correlations it is able to capture. Next, we prove that, in the big-data limit, regardless of whether a teacher is present or not, not only do these Hebbian learning rules converge to the original storage prescription of the Hopfield model, but so do their related free energies (hence, the statistical-mechanical picture provided by Amit, Gutfreund and Sompolinsky is fully recovered). As a sideline, we show the mathematical equivalence between standard cost functions (Hamiltonians), preferred in statistical-mechanics jargon, and quadratic loss functions, preferred in machine-learning terminology. Remarks on the exponential Hopfield model (as the limit of dense networks with diverging density) and on semi-supervised protocols are also provided.
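The big-data convergence claimed in the abstract can be illustrated with a minimal numerical sketch, which is not the paper's derivation: assuming the supervised Hebbian couplings are built from per-pattern empirical averages of noisy examples (a teacher groups the examples by archetype), they should approach the original Hopfield storage prescription J_ij = (1/N) Σ_μ ξ_i^μ ξ_j^μ as the number of examples grows. The sizes and symbols below (N, K, M, r) are illustrative choices, not parameters taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 200, 5        # neurons, stored patterns (illustrative sizes)
r = 0.8              # example quality: P(eta_i = xi_i) = (1 + r) / 2

# Ground-truth binary patterns (the "archetypes")
xi = rng.choice([-1, 1], size=(K, N))

def hopfield_couplings(patterns, n):
    """Original Hopfield storage prescription: J = (1/N) sum_mu xi^mu (xi^mu)^T."""
    J = patterns.T @ patterns / n
    np.fill_diagonal(J, 0.0)
    return J

def supervised_hebbian_couplings(patterns, n, m_examples, noise_r, rng):
    """Hebbian couplings built from per-pattern empirical means of noisy examples
    (a sketch of the supervised protocol: examples are grouped by archetype)."""
    means = []
    for mu in range(patterns.shape[0]):
        # Multiplicative +-1 noise: each unit agrees with the archetype w.p. (1 + r) / 2
        chi = np.where(rng.random((m_examples, n)) < (1 + noise_r) / 2, 1, -1)
        eta = chi * patterns[mu]
        means.append(eta.mean(axis=0))
    means = np.array(means)
    J = means.T @ means / n
    np.fill_diagonal(J, 0.0)
    return J

J_hopfield = hopfield_couplings(xi, N)

# As M -> infinity the empirical means concentrate on r * xi^mu, so the supervised
# couplings approach r^2 * J_hopfield (same attractors after rescaling).
for M in (10, 100, 10000):
    J_sup = supervised_hebbian_couplings(xi, N, M, r, rng) / r**2
    gap = np.abs(J_sup - J_hopfield).max()
    print(f"M = {M:6d}  max |J_sup / r^2 - J_Hopfield| = {gap:.4f}")
```

The rescaling by r² only removes the overall dampening introduced by the example noise; the point of the sketch is that the two coupling matrices, and hence the retrieval attractors, coincide in the large-M limit, which is the convergence the abstract refers to.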