Improving Self-training for Cross-lingual Named Entity Recognition with Contrastive and Prototype Learning (2305.13628v2)

Published 23 May 2023 in cs.CL

Abstract: In cross-lingual named entity recognition (NER), self-training is commonly used to bridge the linguistic gap by training on pseudo-labeled target-language data. However, due to sub-optimal performance on target languages, the pseudo labels are often noisy and limit the overall performance. In this work, we aim to improve self-training for cross-lingual NER by combining representation learning and pseudo-label refinement in one coherent framework. Our proposed method, namely ContProto, mainly comprises two components: (1) contrastive self-training and (2) prototype-based pseudo-labeling. Our contrastive self-training facilitates span classification by separating clusters of different classes, and enhances cross-lingual transferability by producing closely aligned representations between the source and target languages. Meanwhile, prototype-based pseudo-labeling effectively improves the accuracy of pseudo labels during training. We evaluate ContProto on multiple transfer pairs, and experimental results show that our method brings substantial improvements over current state-of-the-art methods.
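The two components described in the abstract lend themselves to a short illustration. The sketch below is not the authors' code; it is a minimal PyTorch rendering of the general ideas, assuming span-level representations from some encoder. The function names, the EMA momentum, the temperature, and the label set are all illustrative assumptions, and the contrastive term is a generic SupCon-style objective rather than the paper's exact loss.

```python
# Illustrative sketch (not the authors' implementation) of prototype-based
# pseudo-label refinement plus a supervised contrastive objective over spans.
import torch
import torch.nn.functional as F

NUM_CLASSES = 5   # assumed label set, e.g. O, PER, LOC, ORG, MISC
HIDDEN_DIM = 768
MOMENTUM = 0.99   # assumed EMA momentum for prototype updates

# One prototype per class, initialized to zero and warmed up on source data,
# then updated as an exponential moving average of assigned span vectors.
prototypes = torch.zeros(NUM_CLASSES, HIDDEN_DIM)

def update_prototypes(span_reprs: torch.Tensor, labels: torch.Tensor) -> None:
    """EMA-update each class prototype with the mean of its spans."""
    for c in range(NUM_CLASSES):
        mask = labels == c
        if mask.any():
            class_mean = span_reprs[mask].mean(dim=0)
            prototypes[c] = MOMENTUM * prototypes[c] + (1 - MOMENTUM) * class_mean

def refine_pseudo_labels(span_reprs: torch.Tensor) -> torch.Tensor:
    """Re-assign each target-language span to its nearest prototype (cosine)."""
    sims = F.normalize(span_reprs, dim=-1) @ F.normalize(prototypes, dim=-1).T
    return sims.argmax(dim=-1)

def contrastive_loss(span_reprs: torch.Tensor, labels: torch.Tensor,
                     temperature: float = 0.1) -> torch.Tensor:
    """Pull same-class spans together and push different-class spans apart."""
    z = F.normalize(span_reprs, dim=-1)
    sims = z @ z.T / temperature
    n = z.size(0)
    logits_mask = ~torch.eye(n, dtype=torch.bool)          # drop self-pairs
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & logits_mask
    log_prob = sims - torch.logsumexp(
        sims.masked_fill(~logits_mask, float("-inf")), dim=1, keepdim=True)
    # Average log-probability over each anchor's positive pairs.
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_counts
    return loss.mean()
```

One plausible way to wire these into a self-training loop: encode labeled source spans and pseudo-labeled target spans, call `update_prototypes` on the more reliable spans, replace noisy target labels via `refine_pseudo_labels` as training progresses, and add `contrastive_loss` to the usual span-classification loss so that source and target representations of the same class are drawn together.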

Authors (5)
  1. Ran Zhou (35 papers)
  2. Xin Li (980 papers)
  3. Lidong Bing (144 papers)
  4. Erik Cambria (136 papers)
  5. Chunyan Miao (145 papers)
Citations (11)
