Modeling Word Relatedness in Latent Dirichlet Allocation (1411.2328v1)
Published 10 Nov 2014 in cs.CL and cs.AI
Abstract: The standard LDA model suffers from the problem that the topic assignment of each word is made independently, so word correlation is neglected. To address this problem, we propose a model called Word Related Latent Dirichlet Allocation (WR-LDA), which incorporates word correlation into the LDA topic model. This leads to capabilities that the standard LDA model does not have, such as estimating topics for infrequently occurring words and multi-language topic modeling. Experimental results demonstrate the effectiveness of our model compared with standard LDA.
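To make the limitation concrete, the sketch below implements a minimal collapsed Gibbs sampler for standard LDA on a toy corpus. Note in the inner loop that each word's topic is resampled from a conditional built purely from count tables, with no term coupling a word to related words. This independence is what the abstract identifies and what WR-LDA is proposed to address. The corpus, hyperparameters, and variable names here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Minimal collapsed Gibbs sampler for standard LDA (illustrative sketch).
# Toy corpus and hyperparameters are assumptions for demonstration only.
rng = np.random.default_rng(0)

docs = [[0, 1, 2, 1], [2, 3, 4, 3], [0, 4, 1, 2]]  # documents as word-id lists
V, K = 5, 2              # vocabulary size, number of topics
alpha, beta = 0.1, 0.01  # symmetric Dirichlet hyperparameters

# Random initial topic assignments and the three count tables LDA maintains.
z = [[int(rng.integers(K)) for _ in doc] for doc in docs]
ndk = np.zeros((len(docs), K))  # document-topic counts
nkw = np.zeros((K, V))          # topic-word counts
nk = np.zeros(K)                # per-topic totals
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

for _ in range(50):  # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove this token's current assignment from the counts.
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
            # Standard LDA conditional: depends only on aggregate counts.
            # No term here relates this word to correlated words -- the
            # independence WR-LDA targets.
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = int(rng.choice(K, p=p / p.sum()))
            z[d][i] = k
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

# Posterior-mean document-topic proportions.
theta = (ndk + alpha) / (ndk + alpha).sum(axis=1, keepdims=True)
print(theta.shape)  # per-document topic distribution, one row per document
```

Because the conditional factorizes over tokens given the counts, a rare word's topic estimate rests on very few observations; a word-correlation prior, as in WR-LDA, lets related words share statistical strength.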