
Chinese Word Segmentation: Another Decade Review (2007-2017) (1901.06079v1)

Published 18 Jan 2019 in cs.CL

Abstract: This paper reviews the development of Chinese word segmentation (CWS) over the most recent decade, 2007-2017. Special attention is paid to the deep learning technologies that have already permeated most areas of NLP. The basic view we have arrived at is that, compared to traditional supervised learning methods, neural network based methods have not shown any superior performance. The most critical challenge still lies in balancing the recognition of in-vocabulary (IV) and out-of-vocabulary (OOV) words. However, as neural models have the potential to capture the essential linguistic structure of natural language, we are optimistic that significant progress may arrive in the near future.
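
As background for the supervised-versus-neural comparison and the IV/OOV discussion in the abstract, the sketch below illustrates the character-tagging (BMES) formulation that most CWS systems of this period, traditional and neural alike, build on. It is a minimal illustration only: the function names and the toy sentence are assumptions for demonstration, not taken from the paper.

```python
# Minimal sketch of CWS as character-level BMES sequence labeling.
# B = word-initial, M = word-internal, E = word-final, S = single-character word.

def words_to_bmes(words):
    """Convert a segmented sentence (list of words) into per-character BMES tags."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")                      # single-character word
        else:
            tags.append("B")                      # word-initial character
            tags.extend("M" * (len(word) - 2))    # word-internal characters
            tags.append("E")                      # word-final character
    return tags


def bmes_to_words(chars, tags):
    """Recover words from characters and their predicted BMES tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("S", "E"):   # a word ends on S or E
            words.append(current)
            current = ""
    if current:                 # tolerate an ill-formed trailing tag sequence
        words.append(current)
    return words


if __name__ == "__main__":
    segmented = ["我们", "喜欢", "自然", "语言", "处理"]   # toy gold segmentation
    tags = words_to_bmes(segmented)
    chars = list("".join(segmented))
    print(tags)                         # ['B', 'E', 'B', 'E', 'B', 'E', 'B', 'E', 'B', 'E']
    print(bmes_to_words(chars, tags))   # ['我们', '喜欢', '自然', '语言', '处理']
```

Under this formulation, a traditional supervised segmenter (e.g. a CRF over character features) and a neural segmenter differ mainly in how they score the tag sequence; OOV words must be recovered purely from the tagging decisions, which is why IV/OOV balance remains the central difficulty noted above.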

Authors (4)
  1. Hai Zhao (227 papers)
  2. Deng Cai (181 papers)
  3. Changning Huang (2 papers)
  4. Chunyu Kit (10 papers)
Citations (24)
