Discovering Language-neutral Sub-networks in Multilingual Language Models (2205.12672v2)

Published 25 May 2022 in cs.CL

Abstract: Multilingual pre-trained language models transfer remarkably well on cross-lingual downstream tasks. However, the extent to which they learn language-neutral representations (i.e., shared representations that encode similar phenomena across languages), and the effect of such representations on cross-lingual transfer performance, remain open questions. In this work, we conceptualize language neutrality of multilingual models as a function of the overlap between language-encoding sub-networks of these models. We employ the lottery ticket hypothesis to discover sub-networks that are individually optimized for various languages and tasks. Our evaluation across three distinct tasks and eleven typologically-diverse languages demonstrates that sub-networks for different languages are topologically similar (i.e., language-neutral), making them effective initializations for cross-lingual transfer with limited performance degradation.
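
The idea of measuring language neutrality as overlap between sub-networks can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes one-shot magnitude pruning as a stand-in for the lottery-ticket procedure, assumes Jaccard similarity as the overlap measure (the abstract does not name one), and uses random matrices in place of fine-tuned model weights. The names `magnitude_mask` and `mask_overlap` are hypothetical.

```python
import numpy as np


def magnitude_mask(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a boolean mask (True = kept) that prunes the smallest-magnitude
    fraction `sparsity` of the weights. One-shot pruning, assumed here for
    brevity; lottery-ticket methods typically prune iteratively."""
    flat = np.abs(weights).ravel()
    n_prune = int(round(sparsity * flat.size))
    mask = np.ones(flat.size, dtype=bool)
    if n_prune > 0:
        # Indices of the n_prune smallest-magnitude entries.
        prune_idx = np.argpartition(flat, n_prune - 1)[:n_prune]
        mask[prune_idx] = False
    return mask.reshape(weights.shape)


def mask_overlap(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Jaccard similarity between two binary masks (assumed overlap metric)."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as identical
    return float(np.logical_and(mask_a, mask_b).sum() / union)


# Toy stand-ins for weights fine-tuned on two different languages:
# a shared component plus small language-specific perturbations.
rng = np.random.default_rng(0)
shared = rng.normal(size=(64, 64))
weights_lang_a = shared + 0.1 * rng.normal(size=shared.shape)
weights_lang_b = shared + 0.1 * rng.normal(size=shared.shape)

mask_a = magnitude_mask(weights_lang_a, sparsity=0.5)
mask_b = magnitude_mask(weights_lang_b, sparsity=0.5)
print(f"Jaccard overlap between sub-networks: {mask_overlap(mask_a, mask_b):.3f}")
```

Under these assumptions, an overlap close to 1 means the two languages retain largely the same weights, which is the sense in which the abstract calls sub-networks "topologically similar".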

Authors

Negar Foroutan
Mohammadreza Banaei
Remi Lebret
Antoine Bosselut
Karl Aberer
