Language Models are Bounded Pragmatic Speakers: Understanding RLHF from a Bayesian Cognitive Modeling Perspective (2305.17760v6)
Abstract: How do large language models (LLMs) "think"? This paper formulates a probabilistic cognitive model called the bounded pragmatic speaker, which can characterize the operation of different variants of LLMs. Specifically, we demonstrate that LLMs fine-tuned with reinforcement learning from human feedback (Ouyang et al., 2022) embody a model of thought that conceptually resembles the fast-and-slow model (Kahneman, 2011) that psychologists have attributed to humans. We discuss the limitations of reinforcement learning from human feedback as a fast-and-slow model of thought and propose avenues for expanding this framework. In essence, our research highlights the value of adopting a cognitive probabilistic modeling approach to gain insights into the comprehension, evaluation, and advancement of LLMs.
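The abstract does not reproduce the paper's equations, but the framework it names can be sketched from its sources. In the Rational Speech Acts tradition (Frank & Goodman, 2012), a pragmatic speaker reweights a base speaker by a listener model, and the well-known closed form of the KL-regularized RLHF objective (Korbak et al., 2022) has the same shape. The notation below ($S_0$, $L$, $\alpha$, $\pi_{\text{base}}$, $r$, $\beta$) is illustrative rather than taken from the paper:

$$ S_1(u \mid w) \;\propto\; S_0(u \mid w)\, L(w \mid u)^{\alpha}, \qquad \pi^{*}(y \mid x) \;\propto\; \pi_{\text{base}}(y \mid x)\, \exp\!\big(r(x, y)/\beta\big). $$

Reading the base policy $\pi_{\text{base}}$ as the fast, amortized System-1 proposal and the reward model $r$ as the slow, deliberate System-2 evaluator is what gives RLHF its fast-and-slow interpretation.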
- Do as I can, not as I say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691, 2022.
- Andreas, J. Language models as agent models. In Findings of the Association for Computational Linguistics: EMNLP 2022, pp. 5769–5779, Abu Dhabi, United Arab Emirates, December 2022. Association for Computational Linguistics.
- Reasoning about pragmatics with neural listeners and speakers. arXiv preprint arXiv:1604.00562, 2016.
- Thinking fast and slow with deep learning and tree search. Advances in neural information processing systems, 30, 2017.
- Does the autistic child have a “theory of mind”? Cognition, 21(1):37–46, 1985.
- Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
- Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374, 2021.
- Innovative BERT-based reranking language models for speech recognition. In 2021 IEEE Spoken Language Technology Workshop (SLT), pp. 266–271. IEEE, 2021.
- PaLM: Scaling language modeling with Pathways. arXiv preprint arXiv:2204.02311, 2022.
- Deep reinforcement learning from human preferences. Advances in neural information processing systems, 30, 2017.
- Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021.
- Modular networks for compositional instruction following. arXiv preprint arXiv:2010.12764, 2020.
- A survey for in-context learning. arXiv preprint arXiv:2301.00234, 2022.
- Predicting pragmatic reasoning in language games. Science, 336(6084):998–998, 2012.
- Unified pragmatic models for generating and following instructions. arXiv preprint arXiv:1711.04987, 2017.
- Causal abstractions of neural networks. Advances in Neural Information Processing Systems, 34:9574–9586, 2021.
- Amortized inference in probabilistic reasoning. In Proceedings of the annual meeting of the cognitive science society, volume 36, 2014.
- Pragmatic language interpretation as probabilistic inference. Trends in cognitive sciences, 20(11):818–829, 2016.
- Children’s understanding of representational change and its relation to the understanding of false belief and the appearance-reality distinction. Child development, pp. 26–37, 1988.
- Probabilistic models of cognition: Exploring representations and inductive biases. Trends in cognitive sciences, 14(8):357–364, 2010.
- Mastering diverse domains through world models. arXiv preprint arXiv:2301.04104, 2023.
- Training compute-optimal large language models. arXiv preprint arXiv:2203.15556, 2022.
- The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751, 2019.
- Kahneman, D. Thinking, Fast and Slow. Macmillan, 2011.
- RL with KL penalties is better viewed as Bayesian inference. arXiv preprint arXiv:2205.11275, 2022.
- Passive learning of active causal strategies in agents and language models. arXiv preprint arXiv:2305.16183, 2023.
- Multi-agent cooperation and the emergence of (natural) language. arXiv preprint arXiv:1612.07182, 2016.
- Levine, S. Reinforcement learning and control as probabilistic inference: Tutorial and review. arXiv preprint arXiv:1805.00909, 2018.
- Lewis, D. K. Convention: A Philosophical Study. Cambridge, MA, USA: Wiley-Blackwell, 1969.
- Contrastive decoding: Open-ended text generation as optimization. arXiv preprint arXiv:2210.15097, 2022.
- NeuroLogic A*esque decoding: Constrained text generation with lookahead heuristics. arXiv preprint arXiv:2112.08726, 2021.
- Dissociating language and thought in large language models: a cognitive perspective. arXiv preprint arXiv:2301.06627, 2023.
- Interactive learning from activity description. In International Conference on Machine Learning, pp. 8096–8108. PMLR, 2021.
- LEVER: Learning to verify language-to-code generation with execution. arXiv preprint arXiv:2302.08468, 2023.
- OpenAI. ChatGPT. https://openai.com/blog/chatgpt, 2022.
- OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
- Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35:27730–27744, 2022.
- Generative agents: Interactive simulacra of human behavior. arXiv preprint arXiv:2304.03442, 2023.
- Does the chimpanzee have a theory of mind? Behavioral and brain sciences, 1(4):515–526, 1978.
- Bayesian brains without probabilities. Trends in cognitive sciences, 20(12):883–893, 2016.
- BLOOM: A 176B-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100, 2022.
- Simon, H. A. Models of Man: Social and Rational. Wiley, 1957.
- Learning to summarize with human feedback. Advances in Neural Information Processing Systems, 33:3008–3021, 2020.
- How to talk so ai will learn: Instructions, descriptions, and autonomy. Advances in Neural Information Processing Systems, 35:34762–34775, 2022.
- How to grow a mind: Statistics, structure, and abstraction. Science, 331(6022):1279–1285, 2011.
- LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023.
- One and done? Optimal decisions from very few samples. Cognitive science, 38(4):599–637, 2014.
- Voyager: An open-ended embodied agent with large language models. arXiv preprint arXiv:2305.16291, 2023.
- Calibrate your listeners! robust communication-based training for pragmatic speakers. arXiv preprint arXiv:2110.05422, 2021.
- Chain of thought prompting elicits reasoning in large language models. arXiv preprint arXiv:2201.11903, 2022.
- Learning to refer informatively by amortizing pragmatic reasoning. arXiv preprint arXiv:2006.00418, 2020.
- Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children’s understanding of deception. Cognition, 13(1):103–128, 1983.
- From word models to world models: Translating from natural language to the probabilistic language of thought. arXiv preprint arXiv:2306.12672, 2023.
- Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601, 2023.
- OPT: Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068, 2022a.
- Coder reviewer reranking for code generation. arXiv preprint arXiv:2211.16490, 2022b.
- Define, evaluate, and improve task-oriented cognitive capabilities for instruction generation models. arXiv preprint arXiv:2301.05149, 2023a.
- Large language models as commonsense knowledge for large-scale task planning. arXiv preprint arXiv:2305.14078, 2023b.