Recovering Mental Representations from Large Language Models with Markov Chain Monte Carlo (2401.16657v1)
Abstract: Simulating sampling algorithms with people has proven to be a useful method for efficiently probing and understanding their mental representations. We propose that the same methods can be used to study the representations of large language models (LLMs). While one can always directly prompt either humans or LLMs to disclose their mental representations introspectively, we show that increased efficiency can be achieved by instead using LLMs as elements of a sampling algorithm. We explore the extent to which we recover human-like representations when LLMs are interrogated with Direct Sampling and Markov chain Monte Carlo (MCMC), and we find a significant increase in efficiency and performance using adaptive sampling algorithms based on MCMC. We also highlight the potential of our approach to yield a more general method of conducting Bayesian inference *with* LLMs.
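The abstract frames the LLM as the decision-making element inside a sampling loop. The sketch below illustrates one common way that idea is instantiated (in the spirit of MCMC with People): a Metropolis-style chain over stimuli in which a two-alternative forced choice serves as the accept/reject step. The helpers `mcmc_with_llm` and `simulated_choice` are hypothetical placeholders for illustration only; the paper's actual prompts, proposal distribution, and acceptance rule may differ.

```python
import random

def mcmc_with_llm(choose, initial_state, propose, n_steps=500):
    """Metropolis-style chain in which a binary choice plays the role of the
    accept/reject step (cf. MCMC with People). `choose(current, proposal)`
    returns True when the chooser prefers the proposal as an example of the
    target category; under a Barker/Luce-choice acceptance rule, the chain's
    stationary distribution tracks the chooser's category representation."""
    state = initial_state
    samples = []
    for _ in range(n_steps):
        proposal = propose(state)        # symmetric proposal in stimulus space
        if choose(state, proposal):      # chooser (e.g. an LLM prompt) decides
            state = proposal
        samples.append(state)
    return samples

# Toy usage with a simulated chooser standing in for an LLM prompt such as
# "Which of these is a better example of <category>: A or B?"
def simulated_choice(current, proposal, ideal=0.7, noise=0.2):
    """Pick the option closer to a hidden 'ideal' value, with some noise
    (a toy stand-in, not a calibrated Barker acceptance function)."""
    prefer_proposal = abs(proposal - ideal) < abs(current - ideal)
    return prefer_proposal if random.random() > noise else not prefer_proposal

chain = mcmc_with_llm(
    choose=simulated_choice,
    initial_state=0.0,
    propose=lambda x: x + random.gauss(0.0, 0.1),  # small Gaussian steps
    n_steps=2000,
)
print(sum(chain[500:]) / len(chain[500:]))  # samples concentrate near the ideal
```

With a real model, `choose` would wrap an API call that presents the two stimuli in a forced-choice prompt and parses the model's answer; the surrounding chain logic is unchanged.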