
The Emergence of Abstract Thought in Large Language Models Beyond Any Language (2506.09890v1)

Published 11 Jun 2025 in cs.CL and cs.AI

Abstract: As LLMs continue to advance, their capacity to function effectively across a diverse range of languages has shown marked improvement. Preliminary studies observe that the hidden activations of LLMs often resemble English, even when responding to non-English prompts. This has led to the widespread assumption that LLMs may "think" in English. However, more recent results showing strong multilingual performance, even surpassing English performance on specific tasks in other languages, challenge this view. In this work, we find that LLMs progressively develop a core language-agnostic parameter space: a remarkably small subset of parameters whose deactivation results in significant performance degradation across all languages. This compact yet critical set of parameters underlies the model's ability to generalize beyond individual languages, supporting the emergence of abstract thought that is not tied to any specific linguistic system. Specifically, we identify language-related neurons (those consistently activated during the processing of particular languages) and categorize them as either shared (active across multiple languages) or exclusive (specific to one). As LLMs undergo continued development over time, we observe a marked increase in both the proportion and functional importance of shared neurons, while exclusive neurons progressively diminish in influence. These shared neurons constitute the backbone of the core language-agnostic parameter space, supporting the emergence of abstract thought. Motivated by these insights, we propose neuron-specific training strategies tailored to LLMs' language-agnostic levels at different development stages. Experiments across diverse LLM families support our approach.

Authors (10)
  1. Yuxin Chen (195 papers)
  2. Yiran Zhao (26 papers)
  3. Yang Zhang (1129 papers)
  4. An Zhang (78 papers)
  5. Kenji Kawaguchi (147 papers)
  6. Shafiq Joty (187 papers)
  7. Junnan Li (56 papers)
  8. Tat-Seng Chua (360 papers)
  9. Michael Qizhe Shieh (8 papers)
  10. Wenxuan Zhang (75 papers)

Summary

  • The paper demonstrates that as LLMs evolve, language-shared neurons emerge as critical components for abstract, language-agnostic thought.
  • It employs parallel neuron detection and deactivation impact measures to distinguish language-related neurons in key model layers.
  • The study proposes targeted training strategies based on neuron importance that substantially enhance multilingual performance and reasoning.

This paper, "The Emergence of Abstract Thought in Large Language Models Beyond Any Language" (Chen et al., 11 Jun 2025), investigates how LLMs process information across multiple languages, challenging the common belief that they primarily "think" in English. The authors propose that as LLMs evolve, they develop a compact, critical set of parameters that is language-agnostic and supports abstract thought, transcending specific linguistic systems.

The core of the paper's methodology lies in identifying and analyzing "language-related neurons" within LLMs. A neuron is defined as a row or column in the model's parameter matrices. A neuron is considered language-related if its deactivation significantly impacts the model's output embedding when processing text in that language. The impact is quantified using the L2 norm of the change in the output embedding after zeroing out the neuron's parameters. To address the computational cost of sequential deactivation, the paper leverages parallel neuron detection methods (detailed in Appendix A), which measure the impact at the layer level (specifically FFN and Self-Attention layer outputs).
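The per-neuron impact measure can be illustrated with a minimal NumPy sketch. This is not the paper's implementation (which uses parallel, layer-level detection over FFN and self-attention outputs); it only shows the underlying idea on a toy linear layer, where a neuron corresponds to one row of the weight matrix:

```python
import numpy as np

def neuron_impact(W, x, neuron_idx):
    """Impact of deactivating one neuron (here, a row of W) on the layer
    output, measured as the L2 norm of the change in the output vector."""
    base = W @ x                     # original layer output
    W_ablated = W.copy()
    W_ablated[neuron_idx, :] = 0.0   # zero out the neuron's parameters
    ablated = W_ablated @ x
    return float(np.linalg.norm(base - ablated))

# Toy example: a layer with 4 neurons and a 3-dimensional input
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
x = rng.normal(size=3)
impacts = [neuron_impact(W, x, i) for i in range(4)]
```

Neurons whose impact exceeds an activation threshold for a given language's corpus would then be flagged as language-related; in practice this must be aggregated over many inputs per language.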

Language-related neurons are further categorized:

  • Language-Shared Neurons: Activated consistently across all languages studied.
  • Language-Exclusive Neurons: Activated for a specific language but not shared across all.
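Once per-language neuron sets are detected, the shared/exclusive split is plain set arithmetic. A small sketch with hypothetical neuron indices (the language sets below are made up for illustration):

```python
# Hypothetical per-language sets of detected language-related neuron indices
related = {
    "en": {1, 4, 7, 9},
    "zh": {1, 4, 8},
    "th": {1, 2, 4},
}

# Shared neurons: active in every language studied
shared = set.intersection(*related.values())

# Exclusive neurons: active for one language and no other
exclusive = {
    lang: s - set.union(*(related[o] for o in related if o != lang))
    for lang, s in related.items()
}
```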

The paper introduces two key metrics:

  • Shared Neuron Ratio: The ratio of shared neurons to the average number of exclusive neurons across languages. This metric quantifies the extent to which multilingual processing relies on common versus specialized neural components.
  • Language-Shared Neuron Importance (Language Agnostic Score): This measures the relative functional impact of shared neurons compared to exclusive neurons, quantified by the change in perplexity upon deactivation. A higher score indicates that shared neurons are more critical to multilingual processing than exclusive ones, suggesting a move towards language-agnostic, abstract functions.
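The two metrics above can be sketched directly from their definitions. This is an assumed formalization, not code from the paper; the perplexity increases fed to the second function would come from actual deactivation experiments:

```python
def shared_neuron_ratio(shared, exclusive_by_lang):
    """Number of shared neurons over the average number of exclusive
    neurons per language."""
    avg_exclusive = sum(len(s) for s in exclusive_by_lang.values()) / len(exclusive_by_lang)
    return len(shared) / avg_exclusive

def language_agnostic_score(ppl_increase_shared, ppl_increase_exclusive):
    """Relative perplexity increase from deactivating shared neurons
    versus an equal number of exclusive neurons."""
    return ppl_increase_shared / ppl_increase_exclusive

# Illustrative values only
shared = {1, 4}
exclusive = {"en": {7, 9}, "zh": {8}, "th": {2}}
ratio = shared_neuron_ratio(shared, exclusive)   # 2 / ((2+1+1)/3) = 1.5
score = language_agnostic_score(6.0, 1.5)        # 4.0
```

A score well above 1 corresponds to the paper's finding in advanced models: ablating shared neurons hurts far more than ablating exclusive ones.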

Through experiments on 20 open-source LLMs (Llama, Qwen, Gemma series) across six diverse languages (Chinese, English, Thai, Swahili, French, German), the paper presents several key findings:

  • Only a small fraction of neurons (less than 1%) are identified as language-related, highlighting sparsity.
  • The proportion of language-shared neurons increases consistently with model evolution and improved multilingual performance, both within specific model series and across different model families. Models with better multilingual capabilities tend to leverage a higher ratio of shared neurons.
  • In early-stage LLMs, shared neurons have similar functional importance to exclusive neurons.
  • In recent, more advanced LLMs, shared neurons exhibit a disproportionately higher functional importance. Deactivating shared neurons causes a much greater increase in perplexity across languages compared to deactivating an equal number of exclusive neurons. This shift signifies that shared neurons evolve into critical, language-agnostic components supporting abstract functions like semantic reasoning and generalization. Random neuron deactivation has minimal impact, validating the specificity of the identified language-related neurons.

Motivated by these findings, the paper proposes neuron-specific training strategies tailored to the model's language-agnostic level:

  • For LLMs with a low language-agnostic score (under-trained), training any language-related neurons (both shared and exclusive) is beneficial.
  • For LLMs with a medium language-agnostic score, training language-shared neurons is recommended, as they are more numerous but not yet fully language-agnostic.
  • For LLMs with a high language-agnostic score (shared neurons are already language-agnostic), focus training on language-exclusive neurons to capture language-specific nuances.
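The three regimes above amount to a simple dispatch on the language-agnostic score. The thresholds below are illustrative placeholders, not values from the paper, which determines the regimes empirically per model:

```python
def select_training_neurons(score, shared, exclusive_union, low=1.0, high=2.0):
    """Choose which neuron set to fine-tune given the model's
    language-agnostic score. Thresholds `low`/`high` are hypothetical."""
    if score < low:
        # Under-trained: tune all language-related neurons
        return shared | exclusive_union
    if score < high:
        # Medium: tune the (numerous, not yet fully agnostic) shared neurons
        return set(shared)
    # High: shared neurons are already language-agnostic; tune exclusive ones
    return set(exclusive_union)

shared = {1, 4}
exclusive_union = {2, 7, 8, 9}
```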

Practical implementation of this targeted training involves identifying the neurons and then applying training updates only to the parameters corresponding to the selected set of neurons (e.g., masking updates for non-selected parameters). The experiments validating this approach involved continued pretraining on a multilingual corpus (CulturaX, MADLAD, Wikipedia samples) using representative models (Llama 3.2-1B, Llama 3.2-3B, and Llama 3.1-8B for low, medium, and high scores, respectively). Evaluating on MGSM (reasoning-heavy) and MMMLU (knowledge-heavy) benchmarks, the results demonstrate that tuning the targeted neuron sets based on the language-agnostic score effectively enhances multilingual performance, particularly on reasoning tasks like MGSM (e.g., Llama 3.1-8B saw a 4.0 point average improvement on MGSM by tuning exclusive neurons, and Llama 3.2-3B saw a 5.7 point gain tuning shared neurons). This indicates that the strategy primarily boosts the model's "thinking" capability rather than just factual recall.
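Masked updates can be sketched as zeroing the gradient on non-selected parameters before the optimizer step. A minimal NumPy version of one SGD step under this assumption (real training would apply the same mask inside the framework's optimizer):

```python
import numpy as np

def masked_sgd_step(params, grads, mask, lr=0.01):
    """Apply an SGD update only where mask == 1; all other parameters
    are frozen for this step."""
    return params - lr * grads * mask

params = np.ones(6)
grads = np.full(6, 2.0)
mask = np.array([1, 0, 1, 0, 0, 1], dtype=float)  # selected neurons' parameters
updated = masked_sgd_step(params, grads, mask, lr=0.1)
# Selected positions move to 1 - 0.1 * 2 = 0.8; the rest stay at 1.0
```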

Implementation considerations include the computational cost of neuron identification and importance calculation, which requires analyzing activation patterns and perplexity changes across a corpus for each language. The parallel detection methods help, but the process remains resource-intensive, especially for larger models. The definition of a neuron and the activation threshold (σ) are empirical choices that can influence results. Targeted training requires infrastructure capable of selectively updating specific subsets of model parameters. The approach relies on a particular definition of language-related neurons based on output-embedding changes, and alternative definitions might reveal different patterns. While validated up to 9B parameters, scaling this analysis and training strategy to models with hundreds of billions or trillions of parameters presents a significant computational challenge.

The practical implications are significant for developing more efficient and capable multilingual LLMs. By understanding which parts of the model contribute to language-specific versus language-agnostic processing, developers can design more targeted training and fine-tuning strategies. This could lead to improved cross-lingual transfer, better performance on low-resource languages, and potentially the development of smaller, more efficient models for specific multilingual applications by leveraging the core language-agnostic components. This neuron-centric view provides a valuable tool for interpreting and enhancing the complex multilingual abilities of LLMs.
