Sparse and Structured Hopfield Networks (2402.13725v2)

Published 21 Feb 2024 in cs.LG

Abstract: Modern Hopfield networks have enjoyed recent interest due to their connection to attention in transformers. Our paper provides a unified framework for sparse Hopfield networks by establishing a link with Fenchel-Young losses. The result is a new family of Hopfield-Fenchel-Young energies whose update rules are end-to-end differentiable sparse transformations. We reveal a connection between loss margins, sparsity, and exact memory retrieval. We further extend this framework to structured Hopfield networks via the SparseMAP transformation, which can retrieve pattern associations instead of a single pattern. Experiments on multiple instance learning and text rationalization demonstrate the usefulness of our approach.
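
To make the abstract's central idea concrete, below is a minimal sketch (not the authors' code) of a sparse Hopfield retrieval step: the softmax in the standard modern Hopfield update is replaced by sparsemax, one member of the Fenchel-Young family the paper builds on, so the network attends to only a few stored patterns and can retrieve a memory exactly. The function names and NumPy setup are illustrative assumptions.

    import numpy as np

    def sparsemax(z):
        # Sparsemax (Martins & Astudillo, 2016): Euclidean projection of z onto
        # the probability simplex. Unlike softmax, the output can contain exact zeros.
        z_sorted = np.sort(z)[::-1]
        cumsum = np.cumsum(z_sorted)
        k = np.arange(1, z.size + 1)
        support = 1 + k * z_sorted > cumsum      # coordinates kept in the support
        k_max = k[support][-1]                   # support size
        tau = (cumsum[k_max - 1] - 1) / k_max    # threshold
        return np.maximum(z - tau, 0.0)

    def hopfield_retrieve(query, memories, beta=1.0):
        # One retrieval step: the usual modern-Hopfield update softmax(beta * X q),
        # with softmax swapped for sparsemax, so the retrieved state is a sparse
        # convex combination of the stored patterns (rows of `memories`).
        p = sparsemax(beta * memories @ query)   # sparse attention over memories
        return memories.T @ p                    # retrieved pattern

    # Example: with enough margin, retrieval is exact (a single memory is returned).
    memories = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0]])
    query = np.array([0.9, 0.2, 0.0])
    print(hopfield_retrieve(query, memories, beta=4.0))   # -> [1. 0. 0.]

In this sketch, a sharper beta or a query closer to a stored pattern makes the sparsemax weights collapse onto a single memory; this exact-retrieval regime is what the paper connects to loss margins and sparsity.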
