Papers

Topics

Authors

Recent

View all

Assistant

AI Research Assistant

Well-researched responses based on relevant abstracts and paper content.

Custom Instructions Pro

Preferences or requirements that you'd like Emergent Mind to consider when generating responses.

Gemini 2.5 Flash

Gemini 2.5 Flash 67 tok/s

Gemini 2.5 Pro 56 tok/s Pro

GPT-5 Medium 25 tok/s Pro

GPT-5 High 33 tok/s Pro

GPT-4o 96 tok/s Pro

Kimi K2 209 tok/s Pro

GPT OSS 120B 429 tok/s Pro

Claude Sonnet 4.5 35 tok/s Pro

2000 character limit reached

Testing the spin-bath view of self-attention: A Hamiltonian analysis of GPT-2 Transformer (2507.00683v1)

Published 1 Jul 2025 in cond-mat.mtrl-sci and cs.LG

Abstract: The recently proposed physics-based framework by Huo and Johnson~\cite{huo2024capturing} models the attention mechanism of LLMs as an interacting two-body spin system, offering a first-principles explanation for phenomena like repetition and bias. Building on this hypothesis, we extract the complete Query-Key weight matrices from a production-grade GPT-2 model and derive the corresponding effective Hamiltonian for every attention head. From these Hamiltonians we obtain analytic \textit{phase boundaries} logit gap criteria that predict which token should dominate the next-token distribution for a given context. A systematic evaluation on 144 heads across 20 factual-recall prompts reveals a strong negative correlation between the theoretical logit gaps and the model's empirical token rankings ($r\approx-0.70$, $p<10^{{-3}$).Targeted} ablations further show that suppressing the heads most aligned with the spin-bath predictions induces the anticipated shifts in output probabilities, confirming a causal link rather than a coincidental association. Taken together, our findings provide the first strong empirical evidence for the spin-bath analogy in a production-grade model. This validation not only furnishes a tractable, physics-inspired lens for interpretability but also provides the groundwork for novel generative models, bridging the gap between theoretical condensed matter physics and AI.