MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning (2104.07908v1)

Published 16 Apr 2021 in cs.CL and cs.LG

Abstract: The combination of multilingual pre-trained representations and cross-lingual transfer learning is one of the most effective methods for building functional NLP systems for low-resource languages. However, for extremely low-resource languages without large-scale monolingual corpora for pre-training or sufficient annotated data for fine-tuning, transfer learning remains an under-studied and challenging task. Moreover, recent work shows that multilingual representations are surprisingly disjoint across languages, bringing additional challenges for transfer onto extremely low-resource languages. In this paper, we propose MetaXL, a meta-learning based framework that learns to transform representations judiciously from auxiliary languages to a target one and brings their representation spaces closer for effective transfer. Extensive experiments on real-world low-resource languages - without access to large-scale monolingual corpora or large amounts of labeled data - for tasks like cross-lingual sentiment analysis and named entity recognition show the effectiveness of our approach. Code for MetaXL is publicly available at github.com/microsoft/MetaXL.
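To make the abstract's core idea concrete, the sketch below illustrates the bilevel update behind meta representation transformation: a small transformation network maps auxiliary-language representations toward the target language's space, the task model takes one differentiable gradient step on the transformed auxiliary data, and the transformation is then updated so that this adapted model performs well on target-language data. This is a minimal first-principles sketch in PyTorch, not the authors' implementation; the linear task head over fixed feature vectors stands in for the full multilingual encoder, and all names (transform, head, inner_lr, the toy batches) are hypothetical.

```python
# Minimal sketch of a MetaXL-style bilevel update (illustrative only).
import torch
import torch.nn as nn

hidden, labels, inner_lr = 64, 2, 1e-2

# Transformation network: maps auxiliary-language hidden states toward
# the target language's representation space.
transform = nn.Sequential(
    nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
)
opt_t = torch.optim.Adam(transform.parameters(), lr=1e-3)

# Task-head parameters kept as plain tensors so one SGD step on them
# can be written functionally and differentiated through.
W = torch.zeros(hidden, labels, requires_grad=True)
b = torch.zeros(labels, requires_grad=True)
loss_fn = nn.CrossEntropyLoss()

def head(h, W, b):
    # Linear probe standing in for the encoder + classification head.
    return h @ W + b

# Toy batches standing in for auxiliary- and target-language examples.
aux_x, aux_y = torch.randn(32, hidden), torch.randint(0, labels, (32,))
tgt_x, tgt_y = torch.randn(8, hidden), torch.randint(0, labels, (8,))

for step in range(200):
    # Inner step: one differentiable SGD update of the task head on
    # *transformed* auxiliary data; create_graph=True keeps the
    # dependence on the transformation network's parameters.
    inner_loss = loss_fn(head(transform(aux_x), W, b), aux_y)
    gW, gb = torch.autograd.grad(inner_loss, (W, b), create_graph=True)
    W1, b1 = W - inner_lr * gW, b - inner_lr * gb

    # Outer step: the transformation is judged by how well the updated
    # head performs on the target language; gradients reach the
    # transform through W1 and b1.
    outer_loss = loss_fn(head(tgt_x, W1, b1), tgt_y)
    opt_t.zero_grad()
    outer_loss.backward()
    opt_t.step()

    # Commit the inner update to the actual head parameters.
    with torch.no_grad():
        W.copy_(W1.detach())
        b.copy_(b1.detach())
```

Note the design point this sketch makes explicit: the target-language loss never touches the transformation directly, so a first-order shortcut would give it zero gradient. Differentiating through the inner update is what lets target-language performance steer how auxiliary representations are transformed.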

Authors (6)
  1. Mengzhou Xia (34 papers)
  2. Guoqing Zheng (25 papers)
  3. Subhabrata Mukherjee (59 papers)
  4. Milad Shokouhi (14 papers)
  5. Graham Neubig (342 papers)
  6. Ahmed Hassan Awadallah (50 papers)
Citations (29)