Exploring the Performance of Recurrent Neural Networks and Transformers in Language Comprehension Tasks
Introduction to the Debate
Recent advances in AI have called into question the long-standing dominance of transformer models in NLP tasks. Long favored for their strong performance on language understanding benchmarks, transformers now face competition from two newly introduced recurrent neural network (RNN) architectures, RWKV and Mamba. The comparison is not merely technical; it touches on a deeper question: which architecture models human language comprehension more effectively?
Recurrent Networks vs. Transformers: A Conceptual Overview
Transformers have typically been preferred in NLP because they handle long-range dependencies well and can be trained in parallel across an entire sequence. However, they operate over a fixed-length context window, which may oversimplify the dynamic, continuous nature of human language processing.
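To make the fixed-length context window concrete, here is a minimal NumPy sketch of single-head causal self-attention with an explicit attention window. It is an illustration of the mechanism only, not the implementation of any model discussed in the paper.

```python
import numpy as np

def causal_attention(x, W_q, W_k, W_v, window=None):
    """Single-head causal self-attention over a sequence x of shape (T, d).
    If `window` is set, each position attends only to the last `window`
    tokens, making the fixed-length context explicit."""
    T, d = x.shape
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(d)
    idx = np.arange(T)
    mask = idx[None, :] > idx[:, None]                     # no future tokens
    if window is not None:
        mask |= idx[None, :] < idx[:, None] - window + 1   # nothing past the window
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
T, d = 8, 16
x = rng.normal(size=(T, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
out = causal_attention(x, W_q, W_k, W_v, window=4)  # each token sees at most 4 tokens
```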
Recurrent neural networks (RNNs), including newer architectures like RWKV and Mamba, process input sequentially: a hidden state is updated at each step and carried forward to the next, mimicking the more continuous absorption of linguistic context seen in human cognition. A minimal version of this feedback loop is sketched below.
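The following is a toy Elman-style recurrence in NumPy. RWKV and Mamba use more elaborate gated and state-space updates, but the core idea of a state that carries all prior context is the same.

```python
import numpy as np

def rnn_step(h, x, W_h, W_x, b):
    """One recurrent step: the new state depends on the previous state,
    so context accumulates token by token rather than over a fixed window."""
    return np.tanh(h @ W_h + x @ W_x + b)

rng = np.random.default_rng(0)
d_in, d_h = 16, 32
W_h = rng.normal(size=(d_h, d_h)) / np.sqrt(d_h)
W_x = rng.normal(size=(d_in, d_h)) / np.sqrt(d_in)
b = np.zeros(d_h)

h = np.zeros(d_h)                          # the state summarizes all prior input
for x_t in rng.normal(size=(10, d_in)):    # a stream of 10 token embeddings
    h = rnn_step(h, x_t, W_h, W_x, b)
```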
Key Takeaways from the Recent Study
- Performance Comparison: The paper compared transformers with the RWKV and Mamba recurrent architectures across several language comprehension datasets. Surprisingly, RNNs matched or even outperformed transformers in several cases, challenging the notion that transformers are inherently superior for such tasks.
- Metrics Analyzed: The models were evaluated on how well they predict human language comprehension, using the N400 (an event-related brain potential associated with semantic processing) and reading times from several studies; a common surprisal-based pipeline for such evaluations is sketched after this list.
- Scaling Effects Observed: Larger models generally predicted the human data better, but only up to a point; for some reading-time measures the trend reversed, suggesting that the biggest models are not always the best at approximating human language processing.
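For illustration, here is a minimal sketch of the surprisal-based approach commonly used in this literature: extract per-word surprisal from a causal language model, then relate it to human reading times. GPT-2 via the Hugging Face transformers library stands in here purely for illustration; it is not one of the models compared in the paper.

```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def token_surprisals(text):
    """Per-token surprisal (-log2 p) under a causal LM. Word-level
    reading-time analyses typically sum surprisal over the sub-word
    tokens that make up each word."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Log-probability of each token given its prefix (shift by one position).
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    nats = -log_probs[torch.arange(targets.size(0)), targets]
    bits = nats / math.log(2)
    return list(zip(tokenizer.convert_ids_to_tokens(targets.tolist()),
                    bits.tolist()))

for tok, s in token_surprisals("The cat sat on the mat."):
    print(f"{tok!r:>10}  {s:6.2f} bits")
```

In such studies, these per-word surprisals are typically entered into regression models of reading times, on the premise that higher-surprisal words are read more slowly.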
Implications for AI and Cognitive Science
The paper's findings highlight a critical reconsideration of how model architecture influences the simulation of human linguistic capabilities. By demonstrating that RNNs can compete with or exceed transformers in specific tasks, it suggests that the cognitive plausibility of RNNs might make them more suitable for applications that require modeling human-like language processing. Furthermore, this comparison opens discussions on the trade-offs between the architectural strengths of both model types.
Future Directions in AI Development
Given the nuanced performance differences revealed in the paper, future research might focus on:
- Hybrid Models: Combining the strengths of RNNs and transformers to create more robust models that leverage the benefits of both architectures (a toy example of such a block is sketched after this list).
- Fine-tuning for Human-like Processing: More targeted adjustments to model training and architecture could enhance the capacity of AI to mimic human cognitive processes, not just outperform on standard benchmarks.
- Broader Applications: Exploring how these insights apply to other areas of AI outside NLP, such as in generative tasks or non-language-based learning.
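As a purely hypothetical illustration of the hybrid idea above, the PyTorch block below stacks a recurrent pass (for streaming context) with a self-attention pass (for long-range retrieval). It is a sketch of the concept, not an architecture proposed in the paper.

```python
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    """Hypothetical hybrid layer: a recurrent pass for streaming context,
    followed by self-attention for long-range retrieval."""
    def __init__(self, d_model, n_heads=4):
        super().__init__()
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        h, _ = self.rnn(self.norm1(x))
        x = x + h                              # residual recurrent mixing
        q = self.norm2(x)
        a, _ = self.attn(q, q, q, need_weights=False)
        return x + a                           # residual attention mixing

x = torch.randn(2, 16, 64)                     # (batch, seq_len, d_model)
print(HybridBlock(64)(x).shape)                # torch.Size([2, 16, 64])
```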
Conclusion
This paper serves as a prompt for AI researchers to reconsider established beliefs about model architectures in language comprehension tasks. As the technology evolves, so too does our understanding of the intricate relationship between human cognition and machine learning models. Continued exploration in this area will not only advance AI technologies but also deepen our understanding of the very nature of human language processing.