Papers
Topics
Authors
Recent
2000 character limit reached

On a Topic Model for Sentences (1606.00253v1)

Published 1 Jun 2016 in cs.CL, cs.IR, and cs.LG

Abstract: Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, and for instance the grouping of words in coherent text spans such as sentences, contains much information which is generally lost with these models. In this paper, we propose sentenceLDA, an extension of LDA whose goal is to overcome this limitation by incorporating the structure of the text in the generative and inference processes. We illustrate the advantages of sentenceLDA by comparing it with LDA using both intrinsic (perplexity) and extrinsic (text classification) evaluation tasks on different text collections.

Citations (44)

Summary

We haven't generated a summary for this paper yet.

Slide Deck Streamline Icon: https://streamlinehq.com

Whiteboard

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.