A Comparative Study of Pre-trained Encoders for Low-Resource Named Entity Recognition (2204.04980v1)

Published 11 Apr 2022 in cs.CL

Abstract: Pre-trained language models (PLM) are effective components of few-shot named entity recognition (NER) approaches when augmented with continued pre-training on task-specific out-of-domain data or fine-tuning on in-domain data. However, their performance in low-resource scenarios, where such data is not available, remains an open question. We introduce an encoder evaluation framework, and use it to systematically compare the performance of state-of-the-art pre-trained representations on the task of low-resource NER. We analyze a wide range of encoders pre-trained with different strategies, model architectures, intermediate-task fine-tuning, and contrastive learning. Our experimental results across ten benchmark NER datasets in English and German show that encoder performance varies significantly, suggesting that the choice of encoder for a specific low-resource scenario needs to be carefully evaluated.
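
The abstract describes comparing pre-trained encoder representations on low-resource NER. The snippet below is a minimal, hypothetical illustration of that idea, not the paper's actual evaluation framework: it pools first-subword vectors from frozen stand-in encoders (bert-base-cased and roberta-base, chosen only as examples) and probes them with a logistic-regression token tagger on a toy two-sentence dataset. All encoder names, the toy data, and the probing classifier are assumptions made for illustration.

```python
# Hypothetical sketch: probe frozen encoders for low-resource NER.
# Encoders, toy data, and the probe below are illustrative only,
# not the setup used in the paper.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Tiny toy corpus with word-level BIO labels (illustrative).
train = [(["Berlin", "is", "in", "Germany"], ["B-LOC", "O", "O", "B-LOC"])]
test = [(["Paris", "lies", "in", "France"], ["B-LOC", "O", "O", "B-LOC"])]
labels = ["O", "B-LOC"]


def word_embeddings(encoder_name, sentences):
    """Return one frozen-encoder vector per word (first-subword pooling)."""
    tok = AutoTokenizer.from_pretrained(encoder_name)
    model = AutoModel.from_pretrained(encoder_name).eval()
    feats = []
    for words, _ in sentences:
        enc = tok(words, is_split_into_words=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]
        seen = set()
        for pos, wid in enumerate(enc.word_ids()):
            if wid is not None and wid not in seen:
                seen.add(wid)
                feats.append(hidden[pos].numpy())
    return feats


def evaluate(encoder_name):
    """Fit a linear probe on the few labeled tokens and score on the test set."""
    X_tr = word_embeddings(encoder_name, train)
    X_te = word_embeddings(encoder_name, test)
    y_tr = [labels.index(t) for _, tags in train for t in tags]
    y_te = [labels.index(t) for _, tags in test for t in tags]
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return f1_score(y_te, clf.predict(X_te), average="micro")


for name in ["bert-base-cased", "roberta-base"]:  # candidate encoders (examples)
    print(name, evaluate(name))
```

In a probing setup like this, the encoder stays frozen and only the lightweight classifier sees the handful of labeled examples, so differences in the resulting scores reflect the quality of the pre-trained representations rather than task-specific fine-tuning.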

Authors (5)
  1. Yuxuan Chen (80 papers)
  2. Jonas Mikkelsen (1 paper)
  3. Arne Binder (4 papers)
  4. Christoph Alt (16 papers)
  5. Leonhard Hennig (25 papers)
Citations (1)
