
Re-weighting Tokens: A Simple and Effective Active Learning Strategy for Named Entity Recognition (2311.00906v1)

Published 2 Nov 2023 in cs.CL and cs.LG

Abstract: Active learning, a widely adopted technique for enhancing machine learning models in text and image classification tasks with limited annotation resources, has received relatively little attention in the domain of Named Entity Recognition (NER). The challenge of data imbalance in NER has hindered the effectiveness of active learning, as sequence labellers lack sufficient learning signals. To address these challenges, this paper presents a novel reweighting-based active learning strategy that assigns dynamic smoothed weights to individual tokens. This adaptable strategy is compatible with various token-level acquisition functions and contributes to the development of robust active learners. Experimental results on multiple corpora demonstrate the substantial performance improvement achieved by incorporating our re-weighting strategy into existing acquisition functions, validating its practical efficacy.
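
The abstract does not state the weighting formula, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes smoothed inverse-frequency class weights, recomputed each round from the currently labelled tokens, applied to token entropy as the acquisition function. The function names and the `beta` smoothing parameter are hypothetical.

```python
import math
from collections import Counter

def smoothed_weights(labels, classes, beta=1.0):
    """Assumed scheme: inverse class frequency with additive smoothing,
    normalised so the weights sum to 1. Rare classes get larger weights."""
    counts = Counter(labels)
    raw = {c: 1.0 / (counts.get(c, 0) + beta) for c in classes}
    norm = sum(raw.values())
    return {c: w / norm for c, w in raw.items()}

def token_entropy(probs):
    """Shannon entropy of one token's predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def sentence_score(token_probs, classes, weights):
    """Sum of token entropies, each scaled by the weight of the
    class the model currently predicts for that token."""
    score = 0.0
    for probs in token_probs:
        best = max(range(len(probs)), key=probs.__getitem__)
        score += weights[classes[best]] * token_entropy(probs)
    return score

# Labelled pool so far: mostly "O" tokens, few entity tokens.
classes = ["O", "PER"]
labelled = ["O"] * 8 + ["PER"] * 2
weights = smoothed_weights(labelled, classes)

# Two one-token sentences with equal entropy but different predicted class.
score_o = sentence_score([[0.7, 0.3]], classes, weights)
score_per = sentence_score([[0.3, 0.7]], classes, weights)
```

Under these assumptions, the sentence whose uncertain token is predicted as the rarer entity class receives the higher acquisition score, which is one way a re-weighting step could counteract the dominance of `O` tokens. Recomputing `weights` after each annotation round makes them dynamic.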

Authors (4)
  1. Haocheng Luo (5 papers)
  2. Wei Tan (55 papers)
  3. Ngoc Dang Nguyen (8 papers)
  4. Lan Du (46 papers)
Citations (2)