Language Models, Agent Models, and World Models: The LAW for Machine Reasoning and Planning (2312.05230v1)
Abstract: Despite their tremendous success in many applications, LLMs often fall short of consistent reasoning and planning across various scenarios (language, embodied, and social), due to inherent limitations in their inference, learning, and modeling capabilities. In this position paper, we present a new perspective on machine reasoning, LAW, which connects the concepts of Language models, Agent models, and World models to enable more robust and versatile reasoning capabilities. In particular, we propose that world and agent models are a better abstraction of reasoning, one that introduces the crucial elements of deliberate human-like reasoning: beliefs about the world and other agents, anticipation of consequences, goals/rewards, and strategic planning. Crucially, LLMs in LAW serve as the backend that implements the system or its elements, thereby providing computational power and adaptability. We review recent studies that have made relevant progress and discuss future research directions toward operationalizing the LAW framework.