
From Word Segmentation to POS Tagging for Vietnamese (1711.04951v1)

Published 14 Nov 2017 in cs.CL

Abstract: This paper presents an empirical comparison of two strategies for Vietnamese Part-of-Speech (POS) tagging from unsegmented text: (i) a pipeline strategy, where the output of a word segmenter serves as the input of a POS tagger, and (ii) a joint strategy, where we predict a combined segmentation and POS tag for each syllable. We also compare state-of-the-art (SOTA) feature-based and neural network-based models. On the benchmark Vietnamese treebank (Nguyen et al., 2009), experimental results show that the pipeline strategy yields better POS tagging scores on unsegmented text than the joint strategy, and that the highest accuracy is obtained with a feature-based model.
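
To make the joint strategy concrete, the following is a minimal sketch of how a word-segmented, POS-tagged sentence could be converted into one combined label per syllable and back. The `B-`/`I-` prefix scheme, the function names `encode_joint` and `decode_joint`, and the example tags are assumptions chosen for illustration; the abstract does not specify the label format the authors use.

```python
from typing import List, Tuple

# Illustrative assumption: each word is a string of space-separated syllables,
# and a joint label is "<B|I>-<POS>", where B marks a word-initial syllable.

def encode_joint(words: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn (word, pos) pairs into (syllable, joint_label) pairs."""
    labeled = []
    for word, pos in words:
        for i, syllable in enumerate(word.split()):
            prefix = "B" if i == 0 else "I"
            labeled.append((syllable, f"{prefix}-{pos}"))
    return labeled


def decode_joint(labeled: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Recover (word, pos) pairs from per-syllable joint labels."""
    words = []
    current: List[str] = []
    current_pos = None
    for syllable, label in labeled:
        prefix, pos = label.split("-", 1)
        # Start a new word on "B", or if an "I" appears with no open word.
        if prefix == "B" or not current:
            if current:
                words.append((" ".join(current), current_pos))
            current, current_pos = [syllable], pos
        else:
            current.append(syllable)
    if current:
        words.append((" ".join(current), current_pos))
    return words


# Hypothetical example: "học sinh" (student) is one word of two syllables.
sentence = [("học sinh", "N"), ("đọc", "V"), ("sách", "N")]
joint = encode_joint(sentence)
```

Under this encoding, a single sequence labeler over syllables predicts segmentation and tags at once, whereas the pipeline strategy would first predict word boundaries and then tag the resulting words in a separate step.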

Authors (5)
  1. Dat Quoc Nguyen (55 papers)
  2. Thanh Vu (59 papers)
  3. Dai Quoc Nguyen (26 papers)
  4. Mark Dras (38 papers)
  5. Mark Johnson (46 papers)
Citations (29)
