Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

HAMNER: Headword Amplified Multi-span Distantly Supervised Method for Domain Specific Named Entity Recognition (1912.01731v1)

Published 3 Dec 2019 in cs.CL

Abstract: To tackle Named Entity Recognition (NER) tasks, supervised methods need to obtain sufficient cleanly annotated data, which is labor and time consuming. On the contrary, distantly supervised methods acquire automatically annotated data using dictionaries to alleviate this requirement. Unfortunately, dictionaries hinder the effectiveness of distantly supervised methods for NER due to its limited coverage, especially in specific domains. In this paper, we aim at the limitations of the dictionary usage and mention boundary detection. We generalize the distant supervision by extending the dictionary with headword based non-exact matching. We apply a function to better weight the matched entity mentions. We propose a span-level model, which classifies all the possible spans then infers the selected spans with a proposed dynamic programming algorithm. Experiments on all three benchmark datasets demonstrate that our method outperforms previous state-of-the-art distantly supervised methods.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Shifeng Liu (8 papers)
  2. Yifang Sun (8 papers)
  3. Bing Li (374 papers)
  4. Wei Wang (1793 papers)
  5. Xiang Zhao (60 papers)
Citations (26)

Summary

We haven't generated a summary for this paper yet.