ConNER: Consistency Training for Cross-lingual Named Entity Recognition (2211.09394v1)

Published 17 Nov 2022 in cs.CL

Abstract: Cross-lingual named entity recognition (NER) suffers from data scarcity in the target languages, especially under zero-shot settings. Existing translate-train and knowledge distillation methods attempt to bridge the language gap, but often introduce a high level of noise. To address this problem, consistency training methods regularize the model to be robust to perturbations of the data or hidden states. However, such methods are likely to violate the consistency hypothesis, or they mainly focus on coarse-grained consistency. We propose ConNER, a novel consistency training framework for cross-lingual NER, which comprises (1) translation-based consistency training on unlabeled target-language data and (2) dropout-based consistency training on labeled source-language data. ConNER effectively leverages unlabeled target-language data and alleviates overfitting on the source language to enhance cross-lingual adaptability. Experimental results show that ConNER achieves consistent improvements over various baseline methods.
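To make the second objective concrete, here is a minimal PyTorch sketch of dropout-based consistency training on labeled source-language data: two stochastic forward passes through the same model (so each sees a different dropout mask), supervised cross-entropy on both, plus a symmetric KL term pulling the two predictive distributions together. This is an illustration under assumptions, not the paper's implementation: the `ToyTagger` module, the `alpha` weight, and the symmetric-KL formulation are all stand-ins (the paper builds on a multilingual encoder such as mBERT or XLM-R).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyTagger(nn.Module):
    """Stand-in token classifier; replaces the multilingual encoder
    (e.g. mBERT/XLM-R) that the paper would actually fine-tune."""

    def __init__(self, vocab_size=1000, dim=64, num_labels=9, p_drop=0.1):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.drop = nn.Dropout(p_drop)
        self.head = nn.Linear(dim, num_labels)

    def forward(self, ids):
        # ids: (batch, seq_len) -> logits: (batch, seq_len, num_labels)
        return self.head(self.drop(self.emb(ids)))


def symmetric_kl(p_logits, q_logits):
    """Symmetric KL between two token-level label distributions.
    A common choice for consistency losses; the paper's exact
    formulation may differ."""
    p_log = F.log_softmax(p_logits, dim=-1)
    q_log = F.log_softmax(q_logits, dim=-1)
    kl_pq = F.kl_div(q_log, p_log.exp(), reduction="batchmean")  # KL(p || q)
    kl_qp = F.kl_div(p_log, q_log.exp(), reduction="batchmean")  # KL(q || p)
    return 0.5 * (kl_pq + kl_qp)


def dropout_consistency_loss(model, ids, labels, alpha=1.0):
    """Two stochastic forward passes (different dropout masks),
    supervised cross-entropy on both, plus a consistency term.
    `alpha` is an assumed weighting hyperparameter."""
    logits_a = model(ids)  # first pass, dropout mask A
    logits_b = model(ids)  # second pass, dropout mask B
    ce = 0.5 * (
        F.cross_entropy(logits_a.transpose(1, 2), labels)
        + F.cross_entropy(logits_b.transpose(1, 2), labels)
    )
    return ce + alpha * symmetric_kl(logits_a, logits_b)


if __name__ == "__main__":
    model = ToyTagger()
    model.train()  # keep dropout active so the two passes differ
    ids = torch.randint(0, 1000, (4, 16))  # fake source-language batch
    labels = torch.randint(0, 9, (4, 16))  # fake BIO tag ids
    loss = dropout_consistency_loss(model, ids, labels)
    loss.backward()
    print(f"loss = {loss.item():.4f}")
```

The translation-based branch (objective 1) would additionally require machine-translated target-language sentences and some form of token-level label alignment across the translation pair, which this sketch omits.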

Authors (6)
  1. Ran Zhou (35 papers)
  2. Xin Li (980 papers)
  3. Lidong Bing (144 papers)
  4. Erik Cambria (136 papers)
  5. Luo Si (73 papers)
  6. Chunyan Miao (145 papers)
Citations (17)
