
Unsupervised Cross-lingual Adaptation for Sequence Tagging and Beyond (2010.12405v3)

Published 23 Oct 2020 in cs.CL

Abstract: Cross-lingual adaptation with multilingual pre-trained language models (mPTLMs) mainly follows two lines of work: the zero-shot approach and the translation-based approach, both of which have been studied extensively on sequence-level tasks. We further verify the efficacy of these cross-lingual adaptation approaches by evaluating their performance on more fine-grained sequence tagging tasks. After re-examining their strengths and drawbacks, we propose a novel framework that consolidates the zero-shot approach and the translation-based approach for better adaptation performance. Instead of simply augmenting the source data with machine-translated data, we tailor-make a warm-up mechanism that quickly updates the mPTLM with gradients estimated on a small amount of translated data. The adaptation approach is then applied to the refined parameters, so the cross-lingual transfer is performed in a warm-start fashion. Experimental results on nine target languages demonstrate that our method benefits the cross-lingual adaptation of various sequence tagging tasks.
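The warm-start procedure described in the abstract can be sketched with a toy model: take a few gradient steps on machine-translated batches to refine the parameters, then run the main adaptation from that refined starting point rather than from the original initialization. This is a minimal illustration only, using a linear least-squares "model" as a stand-in for an mPTLM; all names and hyperparameters here are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_step(w, X, y, lr=0.1):
    """One gradient-descent step on a least-squares loss for a batch
    (stand-in for fine-tuning an mPTLM on a tagging objective)."""
    g = X.T @ (X @ w - y) / len(y)
    return w - lr * g

# Toy data: a source-language training set and a few translated examples.
w_true = np.array([1.0, -2.0])
X_src = rng.normal(size=(64, 2));  y_src = X_src @ w_true
X_trans = rng.normal(size=(8, 2)); y_trans = X_trans @ w_true

w = np.zeros(2)                    # "pre-trained" initialization
for _ in range(3):                 # warm-up: a few steps on translated data
    w = grad_step(w, X_trans, y_trans)
for _ in range(50):                # adaptation starts from the refined w
    w = grad_step(w, X_src, y_src)
```

The key design point is that the translated data only nudges the parameters before adaptation, instead of being mixed into the source training set as plain augmentation.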

Authors (5)
  1. Xin Li (980 papers)
  2. Lidong Bing (144 papers)
  3. Wenxuan Zhang (75 papers)
  4. Zheng Li (326 papers)
  5. Wai Lam (117 papers)
Citations (21)
