Exploring the Latent Language of Multilingual Transformers
Introduction to the Study
In the field of large language models (LLMs), multilingual capability has become a pivotal aspect of research and application. Given the dominance of English in the training corpora of many of these models, a natural question arises: do they fall back on English as an internal pivot language when handling non-English inputs? A paper by Chris Wendler and colleagues at EPFL, "Do Llamas Work in English? On the Latent Language of Multilingual Transformers," investigates this question for the Llama-2 family of transformer models. Their analysis uncovers distinct phases of linguistic processing that carry both theoretical and practical implications for multilingual LLMs.
Probing the Internal Mechanics
The paper presents a methodology for examining how Llama-2 models process linguistic information across their layers. The authors use the "logit lens," a technique that applies the model's unembedding operation to intermediate hidden states, in effect asking each layer which next token it would predict if decoding stopped there. This makes it possible to track how likely the model is to output particular tokens at each stage of processing. Three controlled tasks (word translation, repetition, and cloze completion) are constructed so that the correct next word is unambiguous and its English and target-language forms are distinguishable, allowing the authors to measure whether intermediate layers lean towards English or the target language.
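To make the method concrete, here is a minimal logit-lens sketch using the Hugging Face transformers library. It is not the authors' code: the checkpoint name, the prompt, and the single-prompt loop are illustrative assumptions, and the paper aggregates over many prompts and language pairs.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # illustrative checkpoint; any Llama-2 model works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# A translation-style prompt in the spirit of the paper's task: the next token
# should be the Chinese word for "montagne" (mountain).
prompt = 'Français: "fleur" - 中文: "花"\nFrançais: "montagne" - 中文: "'
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states[0] is the embedding output; entry i is the output of block i.
for layer_idx, hidden in enumerate(out.hidden_states[1:], start=1):
    h = hidden[0, -1]              # hidden state at the final prompt position
    h = model.model.norm(h)        # the final RMSNorm the model applies before unembedding
    logits = model.lm_head(h)      # "premature" unembedding: the logit lens
    probs = torch.softmax(logits.float(), dim=-1)
    top_id = int(probs.argmax())
    print(f"layer {layer_idx:2d}  top token: {tokenizer.decode([top_id])!r}  p={probs[top_id]:.3f}")
```

Tracking the per-layer distributions produced this way is what reveals the three phases described below.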
Key Findings
Across all tasks and model sizes, the experiments consistently show a three-phase linguistic journey through the transformer's layers (a sketch of the per-layer statistics follows this list):
- The initial phase is characterized by high entropy, with neither English nor the target language receiving appreciable probability, indicating that the model is building up context without committing to any language.
- A middle phase where English takes the lead, suggesting an internal pivot towards English, possibly due to the model's heavy exposure to English during training.
- The final phase sees a rapid shift towards the target language, accompanied by a drop in entropy and a rise in token energy (a measure of how strongly the hidden state aligns with the output token embeddings), indicating focused, language-specific processing.
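The sketch below, assuming the per-layer probability vectors computed in the logit-lens loop above, shows the statistics behind these phases: the entropy of each layer's next-token distribution and the probability mass assigned to the English versus the target-language word. The helper name and the treatment of each word as a fixed set of token ids are simplifications of the paper's bookkeeping.

```python
import torch

def layer_statistics(layer_probs, en_token_ids, tgt_token_ids):
    """Per layer: (entropy, probability of the English word, probability of the target word)."""
    stats = []
    for probs in layer_probs:                      # one probability vector per layer
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum().item()
        p_en = probs[en_token_ids].sum().item()    # mass on token id(s) spelling the English word
        p_tgt = probs[tgt_token_ids].sum().item()  # mass on token id(s) spelling the target word
        stats.append((entropy, p_en, p_tgt))
    return stats

# Expected pattern, per the paper's findings:
#   early layers : high entropy, p_en ≈ p_tgt ≈ 0
#   middle layers: entropy falls, p_en > p_tgt   (the English "pivot")
#   final layers : p_tgt overtakes p_en as the model commits to the target language
```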
Implications and Future Directions
The observed pull towards English in the middle layers, even though the final output is correctly produced in the target language, carries significant implications. It points to an English-centric bias inherent in these models, one that could shape how they handle concepts expressed differently, or not at all, in English. On a more theoretical note, the findings suggest that the internal representations of LLMs form an abstract "concept space" that lies closer to English than to other languages.
Moving forward, the paper calls for further exploration of ways to mitigate this linguistic bias. It highlights more diverse and balanced training data, as well as revised tokenization strategies, as avenues towards more equitable multilingual processing. Moreover, a deeper understanding of the structure of the abstract concept space may unlock new strategies for designing LLMs with genuinely global linguistic fluency.
Conclusion
Through rigorous experimentation and analysis, Wendler et al.'s research contributes valuable insights into the latent linguistic operations of multilingual transformers. By illuminating the implicit pivot towards English in these models, the paper not only broadens our understanding of LLMs but also prompts critical reflection on the path towards more inclusive, diverse, and fair AI language technologies.