A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition (2305.12485v2)

Published 21 May 2023 in cs.CL and cs.AI

Abstract: Existing models for named entity recognition (NER) are mainly based on large-scale labeled datasets, which are always obtained through crowdsourcing. However, it is hard to obtain a unified and correct label via majority voting over multiple annotators for NER, owing to the large labeling space and the complexity of the task. To address this problem, we aim to use the original multi-annotator labels directly. In particular, we propose a Confidence-based Partial Label Learning (CPLL) method that integrates the prior confidence (given by annotators) and the posterior confidence (learned by the model) for crowd-annotated NER. The model learns a token- and content-dependent confidence via an Expectation-Maximization (EM) algorithm by minimizing the empirical risk. The true posterior estimator and the confidence estimator run iteratively to update the true posterior and the confidence, respectively. We conduct extensive experiments on both real-world and synthetic datasets, which show that our model improves performance effectively compared with strong baselines.

Authors (10)
  1. Limao Xiong (9 papers)
  2. Jie Zhou (687 papers)
  3. Qunxi Zhu (12 papers)
  4. Xiao Wang (507 papers)
  5. Yuanbin Wu (47 papers)
  6. Qi Zhang (785 papers)
  7. Tao Gui (127 papers)
  8. Xuanjing Huang (287 papers)
  9. Jin Ma (64 papers)
  10. Ying Shan (252 papers)
Citations (3)
