Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer (2309.10891v1)

Published 19 Sep 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Zero-shot cross-lingual transfer is a central task in multilingual NLP, allowing models trained in languages with sufficient training resources to generalize to low-resource languages. Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data to improve cross-lingual transferability, all of which are typically expensive to obtain. In this paper, we propose a simple yet effective method, SALT, to improve the zero-shot cross-lingual transfer of multilingual pretrained language models (PLMs) without the help of such external data. By incorporating code-switching and embedding mixup with self-augmentation, SALT effectively distills cross-lingual knowledge from the multilingual PLM and enhances its transferability on downstream tasks. Experimental results on XNLI and PAWS-X show that our method is able to improve zero-shot cross-lingual transferability without external data. Our code is available at https://github.com/luka-group/SALT.
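
The abstract names two augmentation ingredients: code-switching-style substitution generated by the multilingual PLM itself (self-augmentation, requiring no bilingual dictionary or parallel corpus) and embedding mixup. The sketch below only illustrates those two ideas, not the authors' exact SALT procedure; the xlm-roberta-base checkpoint, the 15% masking ratio, the greedy top-1 fills, and the Beta(0.2, 0.2) mixing coefficient are illustrative assumptions.

```python
# Minimal sketch of (1) PLM-driven token substitution and (2) embedding mixup.
# Assumptions: xlm-roberta-base, 15% masking, greedy fills, Beta(0.2, 0.2) mixup.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "xlm-roberta-base"  # any multilingual masked LM would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()


def self_augment(sentence: str, mask_ratio: float = 0.15) -> str:
    """Replace a random subset of tokens with the PLM's own top predictions,
    so substitutes come from the model rather than external alignment data."""
    enc = tokenizer(sentence, return_tensors="pt")
    ids = enc["input_ids"].clone()
    n_tokens = ids.size(1) - 2                       # skip <s> and </s>
    n_mask = max(1, int(mask_ratio * n_tokens))
    positions = 1 + torch.randperm(n_tokens)[:n_mask]
    ids[0, positions] = tokenizer.mask_token_id      # mask the chosen positions
    with torch.no_grad():
        logits = model(input_ids=ids, attention_mask=enc["attention_mask"]).logits
    ids[0, positions] = logits[0, positions].argmax(dim=-1)  # greedy fill
    return tokenizer.decode(ids[0], skip_special_tokens=True)


def embedding_mixup(emb_a: torch.Tensor, emb_b: torch.Tensor,
                    alpha: float = 0.2) -> torch.Tensor:
    """Mixup (Zhang et al., ICLR 2018): convex combination of two embedding
    tensors of the same shape, e.g. a sentence and its augmented view."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * emb_a + (1.0 - lam) * emb_b
```

How these augmented views enter the fine-tuning objective, and how they are scheduled across source and target languages, is specified in the paper and the linked repository; the code above only shows the building blocks the abstract mentions.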

Authors (4)
  1. Fei Wang (573 papers)
  2. Kuan-Hao Huang (33 papers)
  3. Kai-Wei Chang (292 papers)
  4. Muhao Chen (159 papers)
Citations (3)