Phase Transitions in the Output Distribution of Large Language Models (2405.17088v1)
Abstract: In a physical system, changing parameters such as temperature can induce a phase transition: an abrupt change from one state of matter to another. Analogous phenomena have recently been observed in large language models (LLMs). Typically, identifying a phase transition requires human analysis and some prior understanding of the system to narrow down which low-dimensional properties to monitor and analyze. Statistical methods for the automated detection of phase transitions from data have recently been proposed within the physics community. These methods are largely system agnostic and, as shown here, can be adapted to study the behavior of LLMs. In particular, we quantify distributional changes in the generated output via statistical distances, which can be efficiently estimated with access to the probability distribution over next tokens. This versatile approach is capable of discovering new phases of behavior and unexplored transitions -- an ability that is particularly exciting in light of the rapid development of LLMs and their emergent capabilities.
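As a minimal sketch of the core idea, one can sweep a control parameter (here, sampling temperature) and compare the next-token distributions at nearby parameter values with a statistical distance. The specific distance below (total variation) and the toy logits are illustrative assumptions, not the paper's exact choices; a peak in the dissimilarity as the parameter is swept would flag a candidate transition:

```python
import math

def softmax(logits, temperature):
    # Temperature-scaled softmax over next-token logits
    # (numerically stabilized by subtracting the max).
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def total_variation(p, q):
    # Total variation distance between two discrete distributions,
    # lying in [0, 1]; one example of a statistical distance that is
    # cheap to evaluate given full next-token probabilities.
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical next-token logits for a tiny vocabulary.
logits = [2.0, 1.0, 0.1, -1.0]

# Dissimilarity between output distributions at two nearby
# temperatures; sweeping the temperature grid and locating peaks
# in this signal is the system-agnostic detection idea.
temperatures = [0.5, 0.8, 1.1, 1.4]
distances = [
    total_variation(softmax(logits, t1), softmax(logits, t2))
    for t1, t2 in zip(temperatures, temperatures[1:])
]
```

In practice the distributions would come from a real LLM's next-token probabilities averaged over input contexts, rather than from fixed toy logits.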