Grokking in the Ising Model (2510.25966v1)
Abstract: Delayed generalization, termed grokking, in a machine learning calculation occurs when the training accuracy approaches its maximum value long before the test accuracy does. This paper examines grokking in the context of a neural network trained to classify 2D Ising model configurations. We find, partially with the aid of novel PCA-based network layer analysis techniques, that the grokking behavior can be qualitatively interpreted as a phase transition in the neural network in which the fully connected network transforms into a relatively sparse subnetwork. This in turn reduces the confusion associated with a multiplicity of paths. The network can then identify the common features of the input classes and hence generalize to the recognition of previously unseen patterns.