
Better Character Language Modeling Through Morphology (1906.01037v2)

Published 3 Jun 2019 in cs.CL

Abstract: We incorporate morphological supervision into character language models (CLMs) via multitasking and show that this addition improves bits-per-character (BPC) performance across 24 languages, even when the morphology data and language modeling data are disjoint. Analyzing the CLMs shows that inflected words benefit more from explicitly modeling morphology than uninflected words, and that morphological supervision improves performance even as the amount of language modeling data grows. We then transfer morphological supervision across languages to improve language modeling performance in the low-resource setting.
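
For readers unfamiliar with the metric, the following is a minimal sketch of how bits-per-character is typically computed: the average negative base-2 log-probability a model assigns to each true character. The second function illustrates one common way to combine a language modeling loss with an auxiliary morphology loss in a multitask setup. The function names and the `morph_weight` parameter are hypothetical and chosen for illustration; the abstract does not specify the paper's exact objective.

```python
import math


def bits_per_character(char_probs):
    """Average negative log2-probability over the true characters.

    char_probs: probabilities the model assigned to each gold character.
    Lower values mean the model predicts the text better.
    """
    if not char_probs:
        raise ValueError("need at least one character probability")
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)


def multitask_loss(lm_loss, morph_loss, morph_weight=1.0):
    """Weighted sum of the two task losses.

    Assumption: a simple linear combination; morph_weight is a
    hypothetical hyperparameter, not a value taken from the paper.
    """
    return lm_loss + morph_weight * morph_loss
```

A model that assigns probability 0.5 to every true character scores exactly 1 BPC; one that assigns 0.25 scores 2 BPC.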

Authors
  1. Terra Blevins
  2. Luke Zettlemoyer
Citations (7)
