Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning (2402.14856v2)

Published 20 Feb 2024 in cs.CL and cs.AI

Abstract: Deductive reasoning plays a pivotal role in the formulation of sound and cohesive arguments. It allows individuals to draw conclusions that logically follow, given the truth value of the information provided. Recent progress in the domain of LLMs has showcased their capability in executing deductive reasoning tasks. Nonetheless, a significant portion of research primarily assesses the accuracy of LLMs in solving such tasks, often overlooking a deeper analysis of their reasoning behavior. In this study, we draw upon principles from cognitive psychology to examine the inferential strategies employed by LLMs through a detailed evaluation of their responses to propositional logic problems. Our findings indicate that LLMs display reasoning patterns akin to those observed in humans, including strategies like supposition following or chain construction. Moreover, our research demonstrates that the architecture and scale of the model significantly affect its preferred method of reasoning, with more advanced models adopting such strategies more frequently than less sophisticated models. Importantly, we assert that a model's accuracy, that is, the correctness of its final conclusion, does not necessarily reflect the validity of its reasoning process. This distinction underscores the necessity for more nuanced evaluation procedures in the field.
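For readers unfamiliar with the strategy names, the sketch below works through a small propositional logic problem and shows how each of the two strategies mentioned in the abstract would derive the answer. The problem and the derivations are illustrative assumptions based on how these strategies are characterized in the cognitive psychology literature the paper draws on; they are not taken from the paper's own problem set.

```latex
% Illustrative problem (assumed example, not from the paper); requires amsmath for \text.
% Premises: A \lor B,  \lnot A,  B \rightarrow C.   Question: does C follow?

\paragraph{Chain construction.} Link intermediate conclusions into a chain from the
premises to the answer:
\[
  \lnot A,\; A \lor B \;\Rightarrow\; B; \qquad B,\; B \rightarrow C \;\Rightarrow\; C .
\]

\paragraph{Supposition following.} Suppose one of the clauses and follow up its consequences:
\[
  \text{suppose } A \;\Rightarrow\; \text{contradiction with } \lnot A; \qquad
  \text{suppose } B \;\Rightarrow\; C \ (\text{from } B \rightarrow C),
\]
so the only case consistent with the premises yields $C$.
```

Both routes reach the same final answer; the paper's point is that the path taken to reach it, not just the answer, distinguishes the strategies, which is why accuracy alone can mask an invalid reasoning process.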

Authors (2)
  1. Philipp Mondorf (9 papers)
  2. Barbara Plank (130 papers)
Citations (4)