
Character-based Joint Segmentation and POS Tagging for Chinese using Bidirectional RNN-CRF (1704.01314v3)

Published 5 Apr 2017 in cs.CL

Abstract: We present a character-based model for joint segmentation and POS tagging for Chinese. The bidirectional RNN-CRF architecture for general sequence tagging is adapted and applied with novel vector representations of Chinese characters that capture rich contextual information and sub-character-level features. The proposed model is extensively evaluated and compared with a state-of-the-art tagger on CTB5, CTB9 and UD Chinese. The experimental results indicate that our model is accurate and robust across datasets of different sizes, genres and annotation schemes. We obtain state-of-the-art performance on CTB5, achieving an F1-score of 94.38 for joint segmentation and POS tagging.
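
To illustrate what "joint segmentation and POS tagging" means at the character level, the sketch below decodes one joint tag per character into (word, POS) pairs. It assumes a boundary-plus-POS tag format such as `B-NN`, `I-NN`, `E-NN`, `S-NN` (begin, inside, end, single-character word); the function name `decode_joint_tags`, the tag format, and the example sentence are illustrative assumptions, not details taken from the abstract, and the tagger that would produce the tags is not shown.

```python
from typing import List, Optional, Tuple


def decode_joint_tags(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Turn per-character joint tags like 'B-NN' into (word, POS) pairs.

    Assumed (hypothetical) tag format: '<boundary>-<POS>' with boundary
    in {B, I, E, S}. The POS of a word is taken from its first character.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    words: List[Tuple[str, str]] = []
    buffer = ""                      # characters of the word being built
    buffer_pos: Optional[str] = None  # POS label of the word being built

    for ch, tag in zip(chars, tags):
        boundary, pos = tag.split("-", 1)
        # A new word starts while the previous one is still open:
        # close the previous word so malformed sequences still decode.
        if boundary in ("B", "S") and buffer:
            words.append((buffer, buffer_pos))
            buffer = ""
        if not buffer:
            buffer_pos = pos
        buffer += ch
        # E and S both mark the last character of a word.
        if boundary in ("E", "S"):
            words.append((buffer, buffer_pos))
            buffer = ""

    if buffer:  # flush a word left open at the end of the sentence
        words.append((buffer, buffer_pos))
    return words


example_chars = list("我喜欢北京")
example_tags = ["S-PN", "B-VV", "E-VV", "B-NR", "E-NR"]
example_words = decode_joint_tags(example_chars, example_tags)
print(example_words)  # [('我', 'PN'), ('喜欢', 'VV'), ('北京', 'NR')]
```

Because segmentation boundaries and POS labels are read off the same tag sequence, a single sequence tagger can predict both at once, which is the sense in which the task is joint.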

Authors (4)
  1. Yan Shao (13 papers)
  2. Christian Hardmeier (20 papers)
  3. Jörg Tiedemann (41 papers)
  4. Joakim Nivre (30 papers)
Citations (105)