- The paper identifies Type-I and Type-II anthropocentric biases that skew the evaluation of LLMs' cognitive capacities.
- It demonstrates that auxiliary task demands, computational limits, and mechanistic interference can obscure true artificial competence.
- It advocates an iterative empirical approach combining behavioral experiments and mechanistic studies to objectively assess AI cognition.
Overview of "Anthropocentric bias and the possibility of artificial cognition"
In the paper titled "Anthropocentric bias and the possibility of artificial cognition," Millière and Rathkopf probe the conceptual and methodological biases that arise when the cognitive capacities of LLMs are evaluated against human cognitive abilities. The authors identify two primary types of anthropocentric bias, Type-I and Type-II, and argue that both must be understood and mitigated to assess the true capabilities of LLMs fairly and accurately.
Type-I Anthropocentrism
Type-I anthropocentrism refers to the assumption that an LLM's performance failures on specific tasks are definitive evidence of a lack of competence. This assumption overlooks auxiliary factors that can hinder performance independently of the competence being tested. Three categories of auxiliary factors are identified:
- Auxiliary Task Demands: These arise when LLMs must perform extra tasks, such as making explicit metalinguistic judgments, that are not directly related to the underlying competence of interest. For example, Hu and Frank found that LLMs perform better on syntactic tasks when evaluated by direct probability estimation rather than by explicit metalinguistic judgments (see the first sketch after this list).
- Computational Limitations: Constraints on the expressive power of Transformers can limit their performance on tasks requiring multiple computational steps. Pfau et al. demonstrated that LLMs could solve the 3SUM problem perfectly when allowed enough intermediate steps but failed otherwise (the second sketch below spells out the task itself).
- Mechanistic Interference: Interference from competing computations can obscure the competent processes within an LLM. Studies by Nanda and Zhong illustrated how multiple circuits within an LLM can interact in ways that degrade overall performance despite the presence of competent mechanisms.
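The contrast Hu and Frank draw can be made concrete. Below is a minimal sketch of the direct probability estimation protocol, assuming a Hugging Face causal LM; `gpt2` and the minimal pair are arbitrary stand-ins for illustration, not Hu and Frank's actual models or stimuli:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Total log-probability the model assigns to the sentence's tokens."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Each position predicts the next token, so shift logits/targets by one.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = ids[:, 1:]
    return log_probs.gather(2, targets.unsqueeze(-1)).sum().item()

# Minimal pair probing subject-verb agreement.
good = "The keys to the cabinet are on the table."
bad = "The keys to the cabinet is on the table."

# Direct probability estimation: the model "prefers" the grammatical form
# if it assigns it higher probability; no metalinguistic judgment required.
print(sentence_log_prob(good) > sentence_log_prob(bad))
```

The metalinguistic variant would instead prompt the model with a question such as "Is this sentence grammatical?", stacking an extra task demand on top of the syntactic competence being measured.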
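For concreteness, the 3SUM decision problem is easy to state in code. The sketch below is only the task definition, not Pfau et al.'s experimental setup, which tested whether allowing intermediate tokens gives a Transformer additional computation per answer:

```python
from itertools import combinations

def three_sum_exists(nums: list[int]) -> bool:
    """Decide a 3SUM instance: does any triple of elements sum to zero?"""
    # Brute force checks on the order of n^3 triples; a single fixed-depth
    # forward pass has a bounded compute budget, which is why extra
    # intermediate steps can matter for inputs like these.
    return any(a + b + c == 0 for a, b, c in combinations(nums, 3))

print(three_sum_exists([3, -1, 7, -2]))  # True: 3 + (-1) + (-2) == 0
print(three_sum_exists([1, 2, 4, 8]))    # False: no triple sums to zero
```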
Type-II Anthropocentrism
Type-II anthropocentrism pertains to the assumption that genuine cognitive competence must mirror human cognitive strategies. The authors argue that cognitive kinds should not be defined so narrowly that they fit only human-specific mechanisms. Competence should instead be gauged by the generality and flexibility of the strategies LLMs employ, independently of their resemblance to human cognition.
Empirical Evaluation and Iterative Approaches
The paper posits that evaluating LLMs' cognitive capacities must be an empirical endeavor, focusing on algorithmic rather than physical implementation details. The authors advocate for an iterative process combining carefully designed behavioral experiments with mechanistic studies. This approach aims to empirically map cognitive tasks to LLM-specific capacities and mechanisms.
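To make the proposal concrete, the loop below is a hypothetical skeleton of such an iterative evaluation; every function and field name is invented for illustration, since the paper advances a methodology rather than an implementation:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Finding:
    success: bool
    auxiliary_factor: Optional[str] = None  # e.g. "metalinguistic demand"

def evaluate_capacity(
    run_behavioral: Callable[[str], Finding],
    run_mechanistic: Callable[[str], Finding],
    task: str,
    max_rounds: int = 5,
) -> str:
    """Alternate behavioral tests with mechanistic probes before concluding."""
    for _ in range(max_rounds):
        if run_behavioral(task).success:
            return "evidence of competence"
        # A failure is not treated as definitive (avoiding Type-I bias):
        # check whether an auxiliary factor, not incompetence, explains it.
        probe = run_mechanistic(task)
        if probe.auxiliary_factor is None:
            return "evidence of incompetence"
        # Revise the task to remove the auxiliary demand, then re-test.
        task = f"{task} [without {probe.auxiliary_factor}]"
    return "inconclusive"

# Toy demo: the model fails until the auxiliary demand is stripped away.
behavioral = lambda t: Finding(success="without" in t)
mechanistic = lambda t: Finding(False, auxiliary_factor="metalinguistic demand")
print(evaluate_capacity(behavioral, mechanistic, "agreement task"))
```

The essential design choice is that a behavioral failure routes to a mechanistic probe rather than directly to a verdict, operationalizing the paper's warning against Type-I bias.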
Implications and Future Directions
The implications of this research are profound both theoretically and practically. Theoretically, it challenges the entrenched notion that human cognitive systems are the gold standard for evaluating artificial intelligence. Practically, it underscores the need for a nuanced methodology, potentially revitalizing the empirical study of AI cognition.
Speculatively, this iterative approach could lead to new ontologies of cognitive kinds tailored to LLMs. Such advances could enable more robust and general artificial cognitive systems that diversify beyond human-centric frameworks.
Conclusion
The paper "Anthropocentric bias and the possibility of artificial cognition" provides a rigorous analysis of the biases that currently skew the evaluation of LLMs. By identifying Type-I and Type-II anthropocentric biases and advocating for an empirically driven, iterative framework, Millière and Rathkopf lay the groundwork for a more objective and comprehensive assessment of artificial cognition. This work has the potential to foster future developments in AI, enabling the creation of systems with cognitive capacities that, while different from humans, are no less competent.