Learning Meta-Embeddings by Using Ensembles of Embedding Sets (1508.04257v2)
Published 18 Aug 2015 in cs.CL
Abstract: Word embeddings -- distributed representations of words -- in deep learning are beneficial for many tasks in NLP. However, different embedding sets vary greatly in quality and characteristics of the captured semantics. Instead of relying on a more advanced algorithm for embedding learning, this paper proposes an ensemble approach of combining different public embedding sets with the aim of learning meta-embeddings. Experiments on word similarity and analogy tasks and on part-of-speech tagging show better performance of meta-embeddings compared to individual embedding sets. One advantage of meta-embeddings is the increased vocabulary coverage. We will release our meta-embeddings publicly.
- Wenpeng Yin
- Hinrich Schütze