Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning (2402.14856v2)
Abstract: Deductive reasoning plays a pivotal role in the formulation of sound and cohesive arguments. It allows individuals to draw conclusions that logically follow from the truth of the given premises. Recent progress in large language models (LLMs) has showcased their capability to execute deductive reasoning tasks. Nonetheless, a significant portion of research primarily assesses the accuracy of LLMs on such tasks, often overlooking a deeper analysis of their reasoning behavior. In this study, we draw upon principles from cognitive psychology to examine the inferential strategies employed by LLMs, through a detailed evaluation of their responses to propositional logic problems. Our findings indicate that LLMs display reasoning patterns akin to those observed in humans, including strategies like $\textit{supposition following}$ and $\textit{chain construction}$. Moreover, our research demonstrates that a model's architecture and scale significantly affect its preferred method of reasoning, with more advanced models tending to adopt strategies more frequently than less sophisticated ones. Importantly, we argue that a model's accuracy, that is, the correctness of its final conclusion, does not necessarily reflect the validity of its reasoning process. This distinction underscores the need for more nuanced evaluation procedures in the field.
- Philipp Mondorf
- Barbara Plank