Embedding Learning Through Multilingual Concept Induction (1801.06807v3)
Published 21 Jan 2018 in cs.CL
Abstract: We present a new method for estimating vector space representations of words: embedding learning by concept induction. We test this method on a highly parallel corpus and learn semantic representations of words in 1259 different languages in a single common space. An extensive experimental evaluation on crosslingual word similarity and sentiment analysis indicates that concept-based multilingual embedding learning performs better than previous approaches.