How important is language for human-like intelligence?

Published 19 Sep 2025 in cs.CL (arXiv:2509.15560v1)

Abstract: We use language to communicate our thoughts. But is language merely the expression of thoughts, which are themselves produced by other, nonlinguistic parts of our minds? Or does language play a more transformative role in human cognition, allowing us to have thoughts that we otherwise could (or would) not have? Recent developments in AI and cognitive science have reinvigorated this old question. We argue that language may hold the key to the emergence of both more general AI systems and central aspects of human intelligence. We highlight two related properties of language that make it such a powerful tool for developing domain-general abilities. First, language offers compact representations that make it easier to represent and reason about many abstract concepts (e.g., exact numerosity). Second, these compressed representations are the iterated output of collective minds. In learning a language, we learn a treasure trove of culturally evolved abstractions. Taken together, these properties mean that a sufficiently powerful learning system exposed to language, whether biological or artificial, learns a compressed model of the world, reverse engineering many of the conceptual and causal structures that support human (and human-like) thought.

Summary

  • The paper argues that language is a transformative substrate that compresses culturally evolved abstractions to support flexible, domain-general cognition.
  • Empirical evidence from cognitive science and neuroscience shows that language modulates neural processing and is crucial for advanced reasoning and problem-solving.
  • Analyses of large language models indicate that training on linguistic data enables emergent cognitive abilities in AI, which the authors predict will surpass those of models trained solely on nonlinguistic data.

The Transformative Role of Language in Human and Artificial Intelligence

Introduction

The paper "How important is language for human-like intelligence?" (2509.15560) critically examines the foundational role of language in the emergence and structure of human cognition and its implications for the development of artificial general intelligence (AGI). The authors challenge the traditional view that language is merely a communicative tool or an externalization of pre-existing thought, arguing instead that language fundamentally shapes the nature and scope of human intelligence. By synthesizing evidence from cognitive science, neuroscience, and recent advances in LLMs, the paper posits that language provides both a compressed representational substrate and a repository of culturally evolved abstractions, enabling the development of domain-general cognitive abilities in both biological and artificial systems.

Language as a Compressed, Culturally Evolved Representational System

The authors emphasize two interrelated properties of language that are central to its cognitive power:

  1. Compact Representations: Language enables the efficient encoding and manipulation of abstract concepts, such as exact numerosity, that are otherwise difficult to represent or reason about through direct sensorimotor experience alone. The vocabulary and grammatical structures of natural languages serve as generative compression schemes, allowing individuals to acquire and recombine complex abstractions with minimal cognitive overhead.
  2. Collective Intelligence: Language is the product of cumulative cultural evolution, reflecting the collective cognitive labor of generations. By learning a language, individuals inherit a vast array of pre-discovered abstractions and conceptual distinctions, obviating the need to reinvent them through individual experience. This scaffolding effect is argued to be a key driver of the distinctive flexibility and generality of human cognition.

Empirical Evidence from Cognitive Science and Neuroscience

The paper reviews converging evidence that language is not merely a vehicle for expressing thought but actively shapes cognitive development and performance:

  • Developmental and Clinical Studies: Children deprived of conventional linguistic input (e.g., deaf children without access to sign language) exhibit deficits in theory of mind and spatial reasoning. Adults with aphasia show impairments in fluid reasoning and selective attention, even on tasks traditionally considered "nonverbal." Experimental manipulations of linguistic availability during cognitive tasks causally affect performance, particularly in category learning and rule-based reasoning.
  • Neural Dissociations and Modularity: While neuroimaging studies reveal dissociable brain networks for linguistic and nonlinguistic tasks, the authors argue that such modularity reflects the specialization of certain linguistic processes rather than a strict independence of language and thought. Language modulates perceptual processing at both low and high levels, and its influence is evident in the alignment of visual representations between humans and neural networks trained with language supervision.

Implications from LLMs

The emergence of LLMs provides a novel testbed for evaluating the cognitive role of language:

  • Generalization Beyond Language: Transformer-based LLMs, trained primarily on next-token prediction, acquire not only linguistic competence but also pragmatic inference, systematicity, and the ability to perform a wide range of downstream tasks (e.g., medical diagnosis, summarization) without explicit task-specific training. These abilities emerge despite the absence of specialized language-learning modules, suggesting that exposure to language alone is sufficient to induce general cognitive capabilities.
  • Comparative Performance: The authors predict that neural networks trained exclusively on nonlinguistic data will struggle to match the breadth of human-like intelligence exhibited by language-trained models, particularly in domains requiring relational, analogical, or theory-of-mind reasoning.
  • Mechanistic Considerations: The paper cautions against conflating formal linguistic competence (e.g., syntactic agreement) with functional competence (e.g., reasoning, planning), noting that LLMs can perform both but likely rely on distinct computational mechanisms. Notably, prompting LLMs to use internal language (analogous to inner speech) can enhance performance on complex reasoning tasks.
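To make the training signal discussed above concrete, the sketch below illustrates next-token prediction in its simplest possible form: a bigram model fit by counting on a toy corpus. This is an illustration of the objective, not the paper's method or a transformer; even this trivial model recovers distributional structure from text alone, which is the signal that LLMs scale up with far larger contexts and learned representations.

```python
from collections import Counter, defaultdict

# Toy corpus; any tokenized text would do.
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count bigram transitions: counts[w][v] = number of times v followed w.
counts = defaultdict(Counter)
for w, v in zip(corpus, corpus[1:]):
    counts[w][v] += 1

def predict_next(word):
    """Return the most frequent next token after `word`, or None if unseen."""
    following = counts[word]
    return following.most_common(1)[0][0] if following else None

print(predict_next("sat"))  # "on" — the only continuation of "sat" here
print(predict_next("on"))   # "the"
```

Replacing the count table with a neural network conditioned on the full preceding context yields the next-token objective that LLMs are trained on; the paper's claim is that optimizing this objective at scale is enough to induce the broader capabilities listed above.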

Theoretical and Practical Implications

The paper's central thesis has significant implications for both cognitive science and AI:

  • Reframing the Language-Thought Relationship: The evidence supports a view in which language is not merely a tool for communication but a constitutive element of human-like intelligence. The acquisition of language enables individuals to internalize and manipulate culturally evolved abstractions, facilitating the development of flexible, domain-general cognitive abilities.
  • Design of Artificial General Intelligence: The success of LLMs suggests that exposing artificial systems to natural language is a highly effective strategy for inducing general intelligence. The open-endedness and abstraction density of language make it an ideal substrate for learning the latent structure of the world, as perceived and conceptualized by humans.
  • Limitations and Scaling Considerations: The authors acknowledge that LLMs require orders of magnitude more data and computational resources than humans and that their performance may not generalize to all domains of intelligence. Nevertheless, the parallels between human and machine learning from language underscore the centrality of linguistic input in the emergence of general intelligence.

Future Directions

The paper suggests several avenues for future research:

  • Systematic Comparisons: Direct empirical comparisons between language-trained and nonlinguistic models on a range of cognitive tasks are needed to delineate the specific contributions of language to general intelligence.
  • Mechanistic Elucidation: Further analysis of the internal representations and circuit specializations that emerge in LLMs may shed light on the computational mechanisms by which language scaffolds cognition.
  • Cultural and Linguistic Diversity: The impact of linguistic diversity and the potential homogenization effects of LLMs on both artificial and human cognition warrant further investigation.

Conclusion

This paper advances a compelling argument that language is not merely an externalization of thought but a transformative substrate for the development of human-like intelligence. By providing compact, culturally evolved abstractions, language enables both biological and artificial systems to construct flexible, generalizable internal models of the world. The successes of LLMs, alongside evidence from cognitive science and neuroscience, support the view that language is a key ingredient in the emergence of general intelligence. Future research should further elucidate the mechanisms by which language shapes cognition and explore the implications for the design of more robust and general artificial intelligence systems.
