Evaluating Large Language Models with NeuBAROCO: Syllogistic Reasoning Ability and Human-like Biases (2306.12567v1)
Published 21 Jun 2023 in cs.CL and cs.AI
Abstract: This paper investigates whether current LLMs exhibit biases in logical reasoning, similar to humans. Specifically, we focus on syllogistic reasoning, a well-studied form of inference in the cognitive science of human deduction. To facilitate our analysis, we introduce a dataset called NeuBAROCO, originally designed for psychological experiments that assess human logical abilities in syllogistic reasoning. The dataset consists of syllogistic inferences in both English and Japanese. We examine three types of biases observed in human syllogistic reasoning: belief biases, conversion errors, and atmosphere effects. Our findings demonstrate that current LLMs struggle more with problems involving these three types of biases.
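To make two of the bias types concrete, here is a minimal sketch of a brute-force validity checker for syllogisms. It is not the paper's evaluation code, and it does not use the NeuBAROCO data. The names `holds`, `models`, and `is_valid` are hypothetical. The sketch also assumes the modern reading of the quantifiers, in which "All A are B" is true when A is empty; the dataset may follow a different convention. A conversion error corresponds to treating "All A are B" as if it implied "All B are A". An atmosphere-style error corresponds to accepting a "some" conclusion because both premises are "some" statements. The checker shows that both inferences are formally invalid.

```python
from itertools import product

TERMS = ("A", "B", "C")

def holds(statement, model):
    """Evaluate one categorical statement in a model (dict: term -> set)."""
    quantifier, x, y = statement
    xs, ys = model[x], model[y]
    # Assumption: modern semantics, no existential import.
    if quantifier == "all":
        return xs <= ys
    if quantifier == "no":
        return not (xs & ys)
    if quantifier == "some":
        return bool(xs & ys)
    if quantifier == "some_not":
        return bool(xs - ys)
    raise ValueError(f"unknown quantifier: {quantifier}")

def models(universe_size=3):
    """Yield every assignment of A, B, C to subsets of a small universe."""
    # Each element gets a membership triple: (in A?, in B?, in C?).
    # Under the reading assumed here, three elements are enough to exhibit
    # a counterexample to any invalid syllogism with at most two premises.
    triples = list(product((False, True), repeat=len(TERMS)))
    for memberships in product(triples, repeat=universe_size):
        yield {
            term: {i for i, m in enumerate(memberships) if m[k]}
            for k, term in enumerate(TERMS)
        }

def is_valid(premises, conclusion):
    """True if the conclusion holds in every model that satisfies the premises."""
    return all(
        holds(conclusion, m)
        for m in models()
        if all(holds(p, m) for p in premises)
    )

# A valid form: All B are C, All A are B, therefore All A are C.
barbara = is_valid([("all", "B", "C"), ("all", "A", "B")], ("all", "A", "C"))

# Conversion error: All A are B, therefore All B are A.
conversion = is_valid([("all", "A", "B")], ("all", "B", "A"))

# Atmosphere-style error: Some A are B, Some B are C, therefore Some A are C.
atmosphere = is_valid([("some", "A", "B"), ("some", "B", "C")], ("some", "A", "C"))

print(barbara, conversion, atmosphere)  # True False False
```

Belief bias is not captured by a checker like this, because it depends on whether the conclusion's content is believable rather than on the logical form of the argument.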
Authors: Risako Ando, Takanobu Morishita, Hirohiko Abe, Koji Mineshima, Mitsuhiro Okada