Character-Word LSTM Language Models (1704.02813v1)
Published 10 Apr 2017 in cs.CL
Abstract: We present a character-word long short-term memory (LSTM) language model which both reduces the perplexity with respect to a baseline word-level language model and reduces the number of parameters of the model. Character information can reveal structural (dis)similarities between words and can even be used when a word is out-of-vocabulary, thus improving the modeling of infrequent and unknown words. By concatenating word and character embeddings, we achieve up to 2.77% relative improvement on English compared to a baseline model with a similar amount of parameters and 4.57% on Dutch. Moreover, we also outperform baseline word-level models with a larger number of parameters.
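The core input representation described in the abstract, concatenating a word embedding with the embeddings of the word's characters before feeding the LSTM, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the vocabulary, embedding sizes, the fixed number of characters per word, and the `>` padding symbol are all hypothetical choices.

```python
import numpy as np

# Toy lookup tables (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
vocab = {"<unk>": 0, "the": 1, "cat": 2}
chars = {c: i for i, c in enumerate("<>abcdefghijklmnopqrstuvwxyz")}

word_dim, char_dim, n_chars = 8, 3, 5  # n_chars: fixed character positions per word
W_word = rng.normal(size=(len(vocab), word_dim))
W_char = rng.normal(size=(len(chars), char_dim))

def char_word_embedding(word: str) -> np.ndarray:
    """Concatenate the word embedding with n_chars character embeddings.

    Unknown words fall back to <unk> for the word part, but their
    characters still contribute, which is how character information
    helps with out-of-vocabulary words.
    """
    w = W_word[vocab.get(word, vocab["<unk>"])]
    # Truncate or pad the word to exactly n_chars characters ('>' as pad, an assumption).
    padded = (word[:n_chars] + ">" * n_chars)[:n_chars]
    c = np.concatenate([W_char[chars[ch]] for ch in padded])
    return np.concatenate([w, c])  # length: word_dim + n_chars * char_dim

x = char_word_embedding("cat")
print(x.shape)  # (23,) = 8 + 5 * 3
```

The resulting vector would serve as the per-timestep input to an ordinary LSTM language model; only the input layer changes relative to a word-level baseline.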
- Lyan Verwimp
- Joris Pelemans
- Hugo Van hamme
- Patrick Wambacq