Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning (2004.14218v2)

Published 29 Apr 2020 in cs.CL and cs.LG

Abstract: Recently, fine-tuning pre-trained language models (e.g., multilingual BERT) to downstream cross-lingual tasks has shown promising results. However, the fine-tuning process inevitably changes the parameters of the pre-trained model and weakens its cross-lingual ability, which leads to sub-optimal performance. To alleviate this problem, we leverage continual learning to preserve the original cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks. Experimental results show that our fine-tuning methods can better preserve the cross-lingual ability of the pre-trained model in a sentence retrieval task. Our methods also achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
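The abstract does not spell out which continual learning technique is applied during fine-tuning. As a rough illustration of the general idea (not the paper's specific method), the PyTorch sketch below fine-tunes a model on a downstream task while adding an L2 penalty that anchors the parameters to their pre-trained values, one common way to limit forgetting; `model`, `dataloader`, `loss_fn`, and `anchor_weight` are hypothetical placeholders.

```python
# Minimal sketch: fine-tuning with an L2 "anchor" to the pre-trained weights,
# a generic continual-learning-style regularizer. Illustrative only; the
# paper's actual fine-tuning methods may differ.
import torch
from torch import nn


def finetune_with_anchor(model: nn.Module,
                         dataloader,
                         loss_fn,
                         anchor_weight: float = 0.01,
                         lr: float = 2e-5,
                         epochs: int = 3) -> nn.Module:
    # Snapshot the pre-trained parameters to anchor against.
    anchor = {name: p.detach().clone() for name, p in model.named_parameters()}
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    model.train()
    for _ in range(epochs):
        for inputs, labels in dataloader:
            optimizer.zero_grad()
            logits = model(inputs)
            task_loss = loss_fn(logits, labels)

            # Penalize drift away from the original (cross-lingual) parameters.
            drift = sum(((p - anchor[name]) ** 2).sum()
                        for name, p in model.named_parameters())
            loss = task_loss + anchor_weight * drift
            loss.backward()
            optimizer.step()
    return model
```

Larger values of `anchor_weight` keep the fine-tuned model closer to the pre-trained one (preserving cross-lingual ability) at the cost of downstream task fit, which mirrors the trade-off the abstract describes.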

Authors (4)
  1. Zihan Liu (102 papers)
  2. Genta Indra Winata (94 papers)
  3. Andrea Madotto (64 papers)
  4. Pascale Fung (150 papers)
Citations (18)