LTSG: Latent Topical Skip-Gram for Mutually Learning Topic Model and Vector Representations (1702.07117v1)

Published 23 Feb 2017 in cs.CL

Abstract: Topic models have been widely used in text mining to discover latent topics shared across documents. Vector representations, namely word embeddings and topic embeddings, map words and topics into a low-dimensional, dense, real-valued vector space and have achieved high performance on NLP tasks. However, most existing models assume that the result trained by one of them is perfectly correct and use it as prior knowledge to improve the other. Other models use information learned from a large external corpus to improve results on a smaller corpus. In this paper, we aim to build an algorithmic framework in which topic models and vector representations mutually improve each other within the same corpus. An EM-style algorithm is employed to iteratively optimize both the topic model and the vector representations. Experimental results show that our model outperforms state-of-the-art methods on various NLP tasks.

Authors (4)
  1. Jarvan Law (2 papers)
  2. Hankz Hankui Zhuo (35 papers)
  3. Junhua He (3 papers)
  4. Erhu Rong (1 paper)
Citations (18)
