
Rapid Adaptation of POS Tagging for Domain Specific Uses (1411.0007v1)

Published 31 Oct 2014 in cs.CL, cs.LG, and stat.ML

Abstract: Part-of-speech (POS) tagging is a fundamental component for performing natural language tasks such as parsing, information extraction, and question answering. When POS taggers are trained in one domain and applied in significantly different domains, their performance can degrade dramatically. We present a methodology for rapid adaptation of POS taggers to new domains. Our technique is unsupervised in that a manually annotated corpus for the new domain is not necessary. We use suffix information gathered from large amounts of raw text as well as orthographic information to increase the lexical coverage. We present an experiment in the Biological domain where our POS tagger achieves results comparable to POS taggers specifically trained to this domain.
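
As a rough illustration of the idea of using suffix and orthographic cues to tag words missing from a lexicon, the following is a minimal sketch, not the authors' method. The function names, the three-character suffix limit, the toy lexicon, and the orthographic fallback rules are all assumptions made for this example; the paper gathers suffix statistics from large amounts of raw text, which this sketch does not attempt.

```python
from collections import Counter, defaultdict


def build_suffix_table(lexicon, max_len=3):
    """Count which tags occur with each word-final suffix (1..max_len chars).

    `lexicon` is an iterable of (word, tag) pairs. The suffix is always
    shorter than the word, so a word never counts as its own suffix.
    """
    table = defaultdict(Counter)
    for word, tag in lexicon:
        w = word.lower()
        for n in range(1, min(max_len, len(w) - 1) + 1):
            table[w[-n:]][tag] += 1
    return table


def orthographic_guess(word):
    """Hypothetical fallback rules based on the word's spelling alone."""
    if word.isdigit():
        return "CD"  # assumption: pure digit strings are cardinal numbers
    if any(ch.isdigit() for ch in word):
        return "NN"  # assumption: mixed alphanumerics (e.g. gene names) are nouns
    return None


def guess_tag(word, table, max_len=3, default="NN"):
    """Guess a tag for an unknown word: longest matching suffix first,
    then the orthographic fallback, then a default tag."""
    w = word.lower()
    for n in range(min(max_len, len(w) - 1), 0, -1):
        counts = table.get(w[-n:])
        if counts:
            return counts.most_common(1)[0][0]
    return orthographic_guess(word) or default


# Toy lexicon (illustrative only, not data from the paper).
toy_lexicon = [
    ("phosphorylation", "NN"),
    ("activation", "NN"),
    ("regulated", "VBN"),
    ("activated", "VBN"),
    ("cellular", "JJ"),
]
suffix_table = build_suffix_table(toy_lexicon)
```

With this toy table, `guess_tag("methylation", suffix_table)` falls back on the suffix `ion`, while a token such as `p53` matches no suffix and is handled by the orthographic rule.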

Authors (4)
  1. John E. Miller (2 papers)
  2. Michael Bloodgood (26 papers)
  3. Manabu Torii (3 papers)
  4. K. Vijay-Shanker (10 papers)
Citations (5)
