NER-to-MRC: Named-Entity Recognition Completely Solving as Machine Reading Comprehension (2305.03970v1)

Published 6 May 2023 in cs.CL

Abstract: Named-entity recognition (NER) detects text spans with predefined semantic labels and is an essential building block for NLP. Notably, recent NER research focuses on utilizing massive amounts of extra data, including pre-training corpora, and on incorporating search engines. However, these methods suffer from the high costs of data collection and pre-training, as well as an additional training step for the data retrieved from search engines. To address the above challenges, we completely frame NER as a machine reading comprehension (MRC) problem, called NER-to-MRC, by leveraging MRC's ability to exploit existing data efficiently. Although several prior works have employed MRC-based solutions for the NER problem, several challenges persist: i) the reliance on manually designed prompts; ii) limited MRC-style data reconstruction, which fails to achieve performance on par with methods that utilize extensive additional data. Thus, our NER-to-MRC conversion consists of two components: i) transforming the NER task into a form that the model can solve efficiently with MRC; ii) applying the MRC reasoning strategy to the model. We experiment on 6 benchmark datasets from three domains and achieve state-of-the-art performance without external data, with up to an 11.24% improvement on the WNUT-16 dataset.
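
The abstract does not spell out the exact data reconstruction, so the following is only a minimal sketch of the generic idea of recasting NER as MRC-style span extraction: each entity type becomes a query over the original sentence, and the gold entity spans of that type become the answers. All names here (`MRCExample`, `build_mrc_examples`, the example queries) are hypothetical and not taken from the paper, which notably avoids manually designed prompts.

```python
# Hypothetical sketch: converting NER-annotated sentences into MRC-style
# (query, context, answer-spans) examples. Not the paper's actual construction.
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MRCExample:
    query: str                            # natural-language query for one entity type
    context: str                          # the original sentence
    answer_spans: List[Tuple[int, int]]   # character-level (start, end) spans


def build_mrc_examples(sentence: str,
                       entities: List[Tuple[int, int, str]],
                       type_queries: Dict[str, str]) -> List[MRCExample]:
    """Turn one NER-labeled sentence into one MRC example per entity type."""
    examples = []
    for label, query in type_queries.items():
        spans = [(s, e) for s, e, lab in entities if lab == label]
        examples.append(MRCExample(query=query, context=sentence, answer_spans=spans))
    return examples


# Usage: one sentence with two entity types yields two MRC examples.
sentence = "Barack Obama visited Berlin."
entities = [(0, 12, "PER"), (21, 27, "LOC")]
queries = {"PER": "Which spans are person names?",
           "LOC": "Which spans are locations?"}
for ex in build_mrc_examples(sentence, entities, queries):
    print(ex.query, ex.answer_spans)
```

An MRC reader (e.g., a span-extraction head over a pre-trained encoder) can then be trained on these examples, predicting answer spans for each query instead of per-token NER tags.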

Authors (5)
  1. Yuxiang Zhang (104 papers)
  2. Junjie Wang (164 papers)
  3. Xinyu Zhu (28 papers)
  4. Tetsuya Sakai (30 papers)
  5. Hayato Yamana (9 papers)
Citations (2)