Cross-neutralising: Probing for joint encoding of linguistic information in multilingual models (2010.12825v2)

Published 24 Oct 2020 in cs.CL

Abstract: Multilingual sentence encoders are widely used to transfer NLP models across languages. The success of this transfer, however, depends on the model's ability to encode the patterns of cross-lingual similarity and variation. Yet little is known about how these models do this. We propose a simple method to study how relationships between languages are encoded in two state-of-the-art multilingual models (i.e., M-BERT and XLM-R). The results provide insight into their information-sharing mechanisms and suggest that linguistic properties are encoded jointly across typologically similar languages in these models.

Authors (2)

Rochelle Choenni
Ekaterina Shutova
