Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation (1905.01964v1)

Published 26 Apr 2019 in cs.CL, cs.LG, and stat.ML

Abstract: Chinese named entity recognition (CNER) is an important task in the field of Chinese natural language processing. However, CNER is very challenging because Chinese entity names are highly context-dependent. In addition, Chinese text lacks delimiters to separate words, which makes it difficult to identify entity boundaries. Moreover, the training data for CNER in many domains is usually insufficient, and annotating enough training data is expensive and time-consuming. In this paper, we propose a neural approach for CNER. First, we introduce a CNN-LSTM-CRF neural architecture to capture both local and long-distance contexts for CNER. Second, we propose a unified framework to jointly train CNER and word segmentation models in order to enhance the CNER model's ability to identify entity boundaries. Third, we introduce an automatic method to generate pseudo-labeled samples from existing labeled data, which enriches the training data. Experiments on two benchmark datasets show that our approach effectively improves the performance of Chinese named entity recognition, especially when training data is insufficient.
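
The abstract does not say how the pseudo-labeled samples are generated. As a rough illustration only, the sketch below assumes one plausible scheme: each entity in a labeled sentence is replaced with another entity of the same type drawn from the labeled data, and the character-level BIO tags are rebuilt to match. The function names, the BIO tag scheme, and the `entity_pool` structure are all hypothetical and are not taken from the paper.

```python
import random


def extract_entities(tags):
    """Return (start, end, type) spans from a list of character-level BIO tags."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        if start is not None and not tag.startswith("I-"):
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def make_pseudo_sample(chars, tags, entity_pool, rng):
    """Swap every entity for a same-type entity from entity_pool (assumed scheme).

    chars and tags are parallel lists; entity_pool maps an entity type to a
    list of entity strings, e.g. collected from the labeled training set.
    """
    new_chars, new_tags, prev = [], [], 0
    for start, end, etype in extract_entities(tags):
        # Copy the untouched context before this entity.
        new_chars += chars[prev:start]
        new_tags += tags[prev:start]
        # Insert the replacement entity and rebuild its BIO tags.
        replacement = rng.choice(entity_pool[etype])
        new_chars += list(replacement)
        new_tags += ["B-" + etype] + ["I-" + etype] * (len(replacement) - 1)
        prev = end
    # Copy whatever follows the last entity.
    new_chars += chars[prev:]
    new_tags += tags[prev:]
    return new_chars, new_tags


if __name__ == "__main__":
    chars = list("李明在北京工作")
    tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]
    pool = {"PER": ["王小红", "张伟"], "LOC": ["上海", "广州"]}
    print(make_pseudo_sample(chars, tags, pool, random.Random(0)))
```

Because a replacement can differ in length from the original entity, the tags are regenerated rather than copied, so the new sentence stays aligned with its labels.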

Authors (5)
  1. Fangzhao Wu (81 papers)
  2. Junxin Liu (3 papers)
  3. Chuhan Wu (87 papers)
  4. Yongfeng Huang (110 papers)
  5. Xing Xie (220 papers)
Citations (76)
