
Unsupervised Cross-lingual Adaptation for Sequence Tagging and Beyond (2010.12405v3)

Published 23 Oct 2020 in cs.CL

Abstract: Cross-lingual adaptation with multilingual pre-trained language models (mPTLMs) mainly consists of two lines of work: the zero-shot approach and the translation-based approach, which have been studied extensively on sequence-level tasks. We further verify the efficacy of these cross-lingual adaptation approaches by evaluating their performance on more fine-grained sequence tagging tasks. After re-examining their strengths and drawbacks, we propose a novel framework that consolidates the zero-shot approach and the translation-based approach for better adaptation performance. Instead of simply augmenting the source data with the machine-translated data, we tailor-make a warm-up mechanism to quickly update the mPTLMs with the gradients estimated on a small amount of translated data. Then, the adaptation approach is applied to the refined parameters and the cross-lingual transfer is performed in a warm-start manner. The experimental results on nine target languages demonstrate that our method is beneficial to the cross-lingual adaptation of various sequence tagging tasks.
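The warm-start idea in the abstract can be illustrated with a minimal sketch: take a few gradient steps on the small translated set first, then run the usual adaptation from those refined parameters. This is a toy linear-model illustration under stated assumptions, not the paper's actual mPTLM training pipeline; all function names and hyperparameters here are hypothetical.

```python
import numpy as np

def grad_step(w, X, y, lr):
    """One gradient step of mean-squared-error loss for a linear model."""
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(y)
    return w - lr * grad

def warm_start_adapt(w, X_src, y_src, X_trans, y_trans,
                     warmup_steps=5, adapt_steps=50, lr=0.1):
    # Warm-up: a few gradient steps on the (small) translated set
    # quickly move the shared parameters toward the target language.
    for _ in range(warmup_steps):
        w = grad_step(w, X_trans, y_trans, lr)
    # The usual adaptation (here stood in for by training on the
    # source data) then proceeds from the refined parameters,
    # i.e. a "warm start" rather than a cold start.
    for _ in range(adapt_steps):
        w = grad_step(w, X_src, y_src, lr)
    return w
```

The key design point mirrored here is the ordering: the translated data is used only to refine the initialization, rather than being mixed into the source training set as simple data augmentation.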

Authors (5)
  1. Xin Li (980 papers)
  2. Lidong Bing (144 papers)
  3. Wenxuan Zhang (75 papers)
  4. Zheng Li (326 papers)
  5. Wai Lam (117 papers)
Citations (21)