Neural Vector Conceptualization for Word Vector Space Interpretation (1904.01500v1)
Published 2 Apr 2019 in cs.CL and cs.LG
Abstract: Distributed word vector spaces are considered hard to interpret, which hinders the understanding of NLP models. In this work, we introduce a new method to interpret arbitrary samples from a word vector space. To this end, we train a neural model to conceptualize word vectors, which means that it activates the higher-order concepts it recognizes in a given vector. Contrary to prior approaches, our model operates in the original vector space and is capable of learning non-linear relations between word vectors and concepts. Furthermore, we show that it produces considerably less entropic concept activation profiles than the popular cosine similarity measure.
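The abstract does not spell out the architecture, so the following is a minimal sketch of the general idea: a feed-forward multi-label classifier that maps a fixed word vector directly to per-concept activations. The embedding dimension, hidden size, concept count, and the synthetic training pairs are all illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 300-d word vectors and a concept inventory of 500
# higher-order concepts. Both numbers are assumptions for illustration.
EMBED_DIM, NUM_CONCEPTS = 300, 500


class VectorConceptualizer(nn.Module):
    """Feed-forward net mapping a word vector to concept activations.

    It operates directly in the original embedding space; the hidden
    non-linearity lets it model non-linear vector-concept relations.
    """

    def __init__(self, embed_dim: int, num_concepts: int, hidden: int = 768):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_concepts),
        )

    def forward(self, vec: torch.Tensor) -> torch.Tensor:
        # Independent sigmoid per concept: a single word vector may
        # activate several concepts at once (multi-label output).
        return torch.sigmoid(self.net(vec))


model = VectorConceptualizer(EMBED_DIM, NUM_CONCEPTS)

# One multi-label training step on synthetic (vector, concept-set) pairs;
# real training would use pre-trained word vectors paired with concept
# labels from a knowledge resource.
vectors = torch.randn(64, EMBED_DIM)                      # stand-in word vectors
targets = torch.randint(0, 2, (64, NUM_CONCEPTS)).float() # stand-in concept labels
loss = nn.BCELoss()(model(vectors), targets)
loss.backward()
```

Under this reading, the contrast with the cosine-similarity baseline is that cosine scores against concept prototypes spread mass broadly over many concepts, whereas the trained sigmoid outputs can concentrate activation on a few recognized concepts, yielding the less entropic activation profiles the abstract reports.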