Subword ELMo (1909.08357v1)
Published 18 Sep 2019 in cs.CL
Abstract: Embeddings from Language Models (ELMo) have been shown to be effective for improving many NLP tasks, and ELMo composes word representations from character information to train its language models. However, the character is an insufficient and unnatural linguistic unit for word representation. Thus we introduce Embedding from Subword-aware Language Models (ESuLMo), which learns word representations from subwords produced by unsupervised segmentation over words. We show that ESuLMo enhances four benchmark NLP tasks more effectively than ELMo: syntactic dependency parsing, semantic role labeling, implicit discourse relation recognition, and textual entailment.
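The abstract's key idea is to replace characters with subwords obtained by unsupervised segmentation as the unit for composing word representations. The paper does not specify its segmentation algorithm here; as a hedged illustration only, the sketch below uses byte-pair encoding (BPE), a common unsupervised subword segmentation method, on a toy corpus. All names and the toy data are invented for this example.

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    """Learn BPE merges: repeatedly merge the most frequent adjacent symbol pair."""
    vocab = Counter(tuple(w) for w in words)  # each word as a tuple of characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])  # apply the merge
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

def segment(word, merges):
    """Split a word into subwords by replaying the learned merges in order."""
    pieces = list(word)
    for a, b in merges:
        out, i = [], 0
        while i < len(pieces):
            if i < len(pieces) - 1 and pieces[i] == a and pieces[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(pieces[i])
                i += 1
        pieces = out
    return pieces

# Toy corpus: the learned subwords (e.g. "we", "lo") are longer than single
# characters, giving more natural units for composing a word representation.
corpus = ["lower", "lowest", "newer", "newest"]
merges = learn_bpe_merges(corpus, num_merges=3)
subwords = segment("lowest", merges)
```

In an ESuLMo-style model, the word representation would then be composed from the embeddings of these subword pieces rather than from individual characters.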
- Jiangtong Li
- Hai Zhao
- Zuchao Li
- Wei Bi
- Xiaojiang Liu