Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? (2401.18070v2)

Published 31 Jan 2024 in cs.CL, cs.AI, and cs.LG

Abstract: There is increasing interest in employing LLMs as cognitive models. For such purposes, it is central to understand which properties of human cognition are well-modeled by LLMs, and which are not. In this work, we study the biases of LLMs in relation to those known in children when solving arithmetic word problems. Surveying the learning science literature, we posit that the problem-solving process can be split into three distinct steps: text comprehension, solution planning and solution execution. We construct tests for each one in order to understand whether current LLMs display the same cognitive biases as children in these steps. We generate a novel set of word problems for each of these tests, using a neuro-symbolic approach that enables fine-grained control over the problem features. We find evidence that LLMs, with and without instruction-tuning, exhibit human-like biases in both the text-comprehension and the solution-planning steps of the solving process, but not in the final step, in which the arithmetic expressions are executed to obtain the answer.

Authors (8)
  1. Andreas Opedal (11 papers)
  2. Alessandro Stolfo (12 papers)
  3. Haruki Shirakami (2 papers)
  4. Ying Jiao (19 papers)
  5. Ryan Cotterell (226 papers)
  6. Bernhard Schölkopf (412 papers)
  7. Abulhair Saparov (17 papers)
  8. Mrinmaya Sachan (124 papers)
Citations (10)

Summary

Introduction

The application of LLMs as cognitive models in education raises the question of which features of human cognition they actually replicate. One dimension of this discussion is whether LLMs exhibit the same problem-solving biases observed in human learners, particularly children. This paper investigates that question by dissecting the problem-solving process into three stages: text comprehension, solution planning, and solution execution.

Cognitive Modeling of LLMs

The paper models problem solving as three distinct steps: text comprehension, solution planning, and solution execution. Using a neuro-symbolic generation method that gives fine-grained control over problem features, the researchers constructed a test set for each step and asked whether state-of-the-art LLMs display human-like biases at that step. The models, with and without instruction tuning, reflected human biases in both text comprehension and solution planning. However, they did not exhibit the corresponding bias in the solution-execution step, particularly in computations involving carries.
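To make the idea of fine-grained control concrete, below is a minimal Python sketch of how a symbolic problem specification can be used to build minimal pairs of problems that differ in exactly one feature. The ProblemSpec schema and field names are illustrative assumptions, not the paper's actual representation.

```python
# Hypothetical sketch of feature-controlled problem specs: matched items
# share all features except the one under test. Field names are
# illustrative, not the paper's schema.
from dataclasses import dataclass, replace, asdict

@dataclass(frozen=True)
class ProblemSpec:
    problem_type: str      # e.g. "transfer" (dynamic) vs. "comparison" (static)
    operands: tuple        # the numbers appearing in the problem
    n_steps: int           # how many reasoning steps the solution needs
    requires_carry: bool   # whether executing the arithmetic needs a carry

def differs_only_in(a: ProblemSpec, b: ProblemSpec, field: str) -> bool:
    """True if two specs form a minimal pair for the given feature."""
    da, db = asdict(a), asdict(b)
    return all((da[k] == db[k]) == (k != field) for k in da)

base = ProblemSpec("transfer", (47, 25), n_steps=1, requires_carry=True)
probe = replace(base, problem_type="comparison")
assert differs_only_in(base, probe, "problem_type")
```

Pairing items this way is what allows a test to attribute a difficulty difference to a single feature, such as the problem type or the presence of a carry.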

The paper suggests that biases at the text-comprehension level may stem from the human authors of the training data: biases embedded in the text that models are trained on can carry over into the models themselves. At the solution-planning step, models found problems describing dynamic changes, such as transfers of quantities between agents, easier than problems posing static comparisons. This echoes child learners, who likewise find dynamic transfer problems less challenging than static comparison problems.
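As an illustration of the distinction, here are two invented word problems that share the same arithmetic (5 + 3) but differ in whether the stated relation is dynamic or static; the wording is made up for illustration and is not drawn from the paper's generated dataset.

```python
# Two invented word problems sharing the arithmetic 5 + 3: a dynamic
# "transfer" problem and a static "comparison" problem. Illustrative only,
# not items from the paper's dataset.
TRANSFER = (
    "Mia had 5 stickers. Leo gave her 3 more stickers. "
    "How many stickers does Mia have now?"
)
COMPARISON = (
    "Mia has 5 stickers. Leo has 3 more stickers than Mia. "
    "How many stickers does Leo have?"
)

for label, text in [("transfer", TRANSFER), ("comparison", COMPARISON)]:
    print(f"{label:>10}: {text}")
```

In the transfer variant a quantity changes over time, whereas the comparison variant states a fixed relation between two quantities; both children and the models tested found the former kind easier to plan for.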

Numerical Reasoning in LLMs

Arguably the most striking finding is the absence of bias in the execution of arithmetic expressions, specifically the lack of a "carry effect." In human cognition, carry operations place a higher demand on working memory and thereby increase difficulty. The LLMs studied, however, showed no degradation in performance when carrying was required, marking a clear departure from human cognitive patterns in numerical reasoning.
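The carry effect can be probed by comparing matched additions that do and do not require a carry. The snippet below is a minimal sketch of such a construction, assuming two-digit operands; the sampling scheme is an assumption for illustration, not the paper's exact item design.

```python
# Minimal sketch of building matched carry / no-carry addition items,
# assuming two-digit operands. Illustrative, not the paper's construction.
import random

def requires_carry(x: int, y: int) -> bool:
    """True if column-wise addition of x and y needs at least one carry."""
    while x > 0 or y > 0:
        if (x % 10) + (y % 10) >= 10:
            return True
        x, y = x // 10, y // 10
    return False

def sample_addition(carry: bool, rng: random.Random) -> tuple[int, int]:
    """Sample a two-digit addition that does (or does not) require a carry."""
    while True:
        x, y = rng.randint(10, 99), rng.randint(10, 99)
        if requires_carry(x, y) == carry:
            return x, y

rng = random.Random(0)
print("carry item:   ", sample_addition(True, rng))
print("no-carry item:", sample_addition(False, rng))
```

A human-like solver would be expected to make more errors on the carry items; the paper reports no such gap for the LLMs evaluated.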

This contrast may highlight fundamental differences between LLM memory mechanisms and human working memory limitations. The finding also prompts questions about the composition of the training datasets for LLMs and whether they encompass a sufficient diversity of arithmetic expressions to instill such a numerical bias.

Implications and Future Directions

These findings have practical implications for the design and deployment of educational technology using LLMs. The biases observed in text comprehension and solution planning underscore the models' potential to mimic human-like reasoning in earlier problem-solving stages. Consequently, educators and technologists should consider these cognitive biases when leveraging LLMs for educational purposes.

However, the absence of the carry effect bias suggests that caution is warranted when relying on LLMs to replicate human-like numerical reasoning. Therefore, careful validation against human cognitive processes is vital, especially if these models are to be used as accurate representations of student problem-solving behaviors.

As an extension of this work, future research might investigate other cognitive biases that are present in children but absent in adults. Exploring different instructional prompting strategies could also provide insights into how models replicate nuanced human thought processes. Finally, analyzing the behavior of LLMs across various languages might uncover additional layers of complexity in cognitive modeling.