
The Dynamic Embedded Topic Model (1907.05545v2)

Published 12 Jul 2019 in cs.CL and stat.ML

Abstract: Topic modeling analyzes documents to learn meaningful patterns of words. For documents collected in sequence, dynamic topic models capture how these patterns vary over time. We develop the dynamic embedded topic model (D-ETM), a generative model of documents that combines dynamic latent Dirichlet allocation (D-LDA) and word embeddings. The D-ETM models each word with a categorical distribution parameterized by the inner product between the word embedding and a per-time-step embedding representation of its assigned topic. The D-ETM learns smooth topic trajectories by defining a random walk prior over the embedding representations of the topics. We fit the D-ETM using structured amortized variational inference with a recurrent neural network. On three different corpora---a collection of United Nations debates, a set of ACL abstracts, and a dataset of Science Magazine articles---we found that the D-ETM outperforms D-LDA on a document completion task. We further found that the D-ETM learns more diverse and coherent topics than D-LDA while requiring significantly less time to fit.
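The abstract's core construction, that each topic at each time step is an embedding vector, that the topic-word distribution is a softmax over inner products with word embeddings, and that topic embeddings follow a random walk across time, can be sketched in NumPy. This is a minimal illustrative sketch, not the authors' implementation; the dimensions, the step size `sigma`, and the function name `topic_word_dists` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

V, K, L, T = 1000, 5, 300, 10  # vocab size, topics, embedding dim, time steps
sigma = 0.05                   # random-walk step size (assumed value)

rho = rng.normal(size=(V, L))  # word embedding matrix (V x L)

# Random walk prior over topic embeddings: alpha_t = alpha_{t-1} + noise,
# which is what yields smooth topic trajectories over time.
alpha = np.zeros((T, K, L))
alpha[0] = rng.normal(size=(K, L))
for t in range(1, T):
    alpha[t] = alpha[t - 1] + sigma * rng.normal(size=(K, L))

def topic_word_dists(alpha_t, rho):
    """Per-topic categorical distributions over the vocabulary at one
    time step: a softmax over the inner products rho @ alpha_t^T."""
    logits = rho @ alpha_t.T           # (V, K): word-topic inner products
    logits -= logits.max(axis=0)       # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=0)   # each column is a distribution over words

beta_t = topic_word_dists(alpha[3], rho)  # topic-word distributions at t = 3
```

Each column of `beta_t` is the categorical distribution for one topic at that time step; a word assigned to topic k at time t would be drawn from column k.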

Authors (3)
  1. Adji B. Dieng (12 papers)
  2. Francisco J. R. Ruiz (22 papers)
  3. David M. Blei (110 papers)
Citations (84)
