Character-Word LSTM Language Models (1704.02813v1)
Published 10 Apr 2017 in cs.CL
Abstract: We present a character-word long short-term memory (LSTM) language model which both reduces the perplexity with respect to a baseline word-level language model and reduces the number of parameters of the model. Character information can reveal structural (dis)similarities between words and can even be used when a word is out-of-vocabulary, thus improving the modeling of infrequent and unknown words. By concatenating word and character embeddings, we achieve up to 2.77% relative improvement on English compared to a baseline model with a similar amount of parameters and 4.57% on Dutch. Moreover, we also outperform baseline word-level models with a larger number of parameters.
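The core input representation described in the abstract, concatenating a word embedding with the embeddings of the word's characters before feeding the LSTM, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the vocabulary, embedding sizes, the fixed number of characters per word, and the `>` padding symbol are all hypothetical choices.

```python
import numpy as np

# Toy lookup tables (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
vocab = {"<unk>": 0, "the": 1, "cat": 2}
chars = {c: i for i, c in enumerate("<>abcdefghijklmnopqrstuvwxyz")}

word_dim, char_dim, n_chars = 8, 3, 5  # n_chars: fixed character positions per word
W_word = rng.normal(size=(len(vocab), word_dim))
W_char = rng.normal(size=(len(chars), char_dim))

def char_word_embedding(word: str) -> np.ndarray:
    """Concatenate the word embedding with n_chars character embeddings.

    Unknown words fall back to <unk> for the word part, but their
    characters still contribute, which is how character information
    helps with out-of-vocabulary words.
    """
    w = W_word[vocab.get(word, vocab["<unk>"])]
    # Truncate or pad the word to exactly n_chars characters ('>' as pad, an assumption).
    padded = (word[:n_chars] + ">" * n_chars)[:n_chars]
    c = np.concatenate([W_char[chars[ch]] for ch in padded])
    return np.concatenate([w, c])  # length: word_dim + n_chars * char_dim

x = char_word_embedding("cat")
print(x.shape)  # (23,) = 8 + 5 * 3
```

The resulting vector would serve as the per-timestep input to an ordinary LSTM language model; only the input layer changes relative to a word-level baseline.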
- Lyan Verwimp
- Joris Pelemans
- Hugo Van hamme
- Patrick Wambacq