Analysis of Primacy Effect in LLMs
This paper investigates the presence of the primacy effect in LLMs, focusing specifically on ChatGPT, Gemini, and Claude. Using a conceptual framework inspired by Asch's 1946 impression-formation experiments, the authors examine how these models process adjectival descriptions presented in different orders and whether they exhibit biases similar to those observed in human cognition.
Experimental Design and Results
Two experiments were conducted to probe the primacy effect. In the first, each LLM was shown descriptions of two candidates simultaneously, with adjectives listed in opposite orders: one description began with positive adjectives followed by negative ones, the other with negative adjectives first. ChatGPT tended to prefer the candidate whose positive adjectives came first, Gemini showed no preference, and Claude consistently refused to choose.
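A minimal sketch of how such a paired-comparison trial might be constructed is given below. The adjective lists, prompt wording, and the query_model placeholder are illustrative assumptions, not the authors' actual materials.

```python
# Illustrative only: adjective pools, prompt wording, and query_model are
# assumptions for the sake of the sketch, not the paper's exact stimuli.
POSITIVE = ["intelligent", "diligent", "honest"]
NEGATIVE = ["stubborn", "impulsive", "careless"]


def describe(name: str, positive_first: bool) -> str:
    """Build a one-line candidate description with adjectives in a fixed order."""
    adjectives = POSITIVE + NEGATIVE if positive_first else NEGATIVE + POSITIVE
    return f"{name} is " + ", ".join(adjectives) + "."


def paired_prompt() -> str:
    """Present both candidates at once, mirroring the first experiment's setup."""
    return (
        "Consider two job candidates.\n"
        + describe("Candidate A", positive_first=True) + "\n"
        + describe("Candidate B", positive_first=False) + "\n"
        + "Which candidate would you hire? Answer with A or B."
    )


def query_model(prompt: str) -> str:
    """Placeholder: wire this to the ChatGPT, Gemini, or Claude API under test."""
    raise NotImplementedError


if __name__ == "__main__":
    print(paired_prompt())  # Inspect the stimulus before sending it to a model.
```

Repeating this trial with shuffled adjective sets and candidate labels would be one way to separate a genuine order effect from idiosyncrasies of any single prompt.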
The second experiment refined the task by presenting each candidate description individually and asking the LLMs to rate the candidate on a scale of 1 to 5. This setup elicited ratings from Claude, circumventing the refusals seen with simultaneous presentation. In this experiment, ChatGPT and Claude mostly assigned equal ratings to both candidates; when their ratings differed, they favored the candidate whose negative adjectives were listed first, a pattern that was notably more pronounced in Gemini.
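Under the same assumptions, the individual-rating variant might be sketched as follows, reusing the describe helper and model-querying placeholder from the previous sketch.

```python
import re
from typing import Callable, Optional


def rating_prompt(description: str) -> str:
    """Present a single candidate and request a 1-5 rating, as in the second experiment."""
    return (
        description + "\n"
        "On a scale of 1 to 5, how suitable is this candidate for the job? "
        "Answer with a single number."
    )


def parse_rating(reply: str) -> Optional[int]:
    """Extract the first digit between 1 and 5 from the model's reply, if present."""
    match = re.search(r"[1-5]", reply)
    return int(match.group()) if match else None


def compare_orders(query_model: Callable[[str], str],
                   describe: Callable[[str, bool], str]) -> str:
    """Rate positive-first and negative-first descriptions separately, then compare."""
    pos_first = parse_rating(query_model(rating_prompt(describe("Candidate A", True))))
    neg_first = parse_rating(query_model(rating_prompt(describe("Candidate B", False))))
    if pos_first is None or neg_first is None:
        return "unparseable reply"
    if pos_first == neg_first:
        return "equal ratings"
    return "positive-first preferred" if pos_first > neg_first else "negative-first preferred"
```

Running compare_orders over many trials and adjective sets would yield the per-model counts of equal, positive-first, and negative-first outcomes that the paper summarizes.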
Discussion and Implications
The findings reveal inconsistent susceptibility to the primacy effect across the models. ChatGPT's inclination toward candidates described positively at the outset and Gemini's marked preference for candidates whose negative adjectives appeared first in the rating task highlight how this behavior varies with model architecture and training paradigms. This inconsistency across LLMs raises ethical concerns, particularly in automated decision-making, where such order effects could undermine fairness and transparency.
These experiments help establish the extent to which LLMs are influenced by cognitive biases akin to those documented in human psychology, a critical consideration for researchers and developers. The implications are most pronounced in domains where equitable treatment is paramount and users may lack the expertise to detect or mitigate algorithmic bias.
Future Directions
To address the challenges posed by cognitive biases in LLMs, further research should focus on developing robust metrics for bias evaluation and on improving transparency in model development. Collaboration among AI developers, psychologists, and ethicists is essential to mitigate unintended bias and ensure the responsible deployment of LLMs in sensitive applications.
Ultimately, this paper contributes to the broader discourse on how AI systems' behavior aligns with patterns known from human psychology, highlighting biases in these models that warrant attention. As AI systems permeate more aspects of society, these insights offer useful direction for future work on LLM safety and ethics.