
Clustering via Content-Augmented Stochastic Blockmodels (1505.06538v1)

Published 25 May 2015 in stat.ML, cs.LG, and cs.SI

Abstract: Much of the data being created on the web contains interactions between users and items. Stochastic blockmodels, and other methods for community detection and clustering of bipartite graphs, can infer latent user communities and latent item clusters from this interaction data. These methods, however, typically ignore the items' contents and the information they provide about item clusters, despite the tendency of items in the same latent cluster to share commonalities in content. We introduce content-augmented stochastic blockmodels (CASB), which use item content together with user-item interaction data to enhance the user communities and item clusters learned. Comparisons to several state-of-the-art benchmark methods, on datasets arising from scientists interacting with scientific articles, show that content-augmented stochastic blockmodels provide highly accurate clusters with respect to metrics representative of the underlying community structure.
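To make the setting concrete, the following is a minimal toy sketch of the kind of data the abstract describes: users belong to latent communities, items belong to latent clusters, interactions depend on the (community, cluster) pair, and item content depends on the item's cluster. It is not the CASB model from the paper, whose details are not given in the abstract. The function name `sample_toy_bipartite_sbm`, the parameters `block_probs` and `vocab_by_cluster`, and the uniform word-sampling content model are all assumptions introduced for illustration.

```python
import random


def sample_toy_bipartite_sbm(n_users, n_items, block_probs, vocab_by_cluster,
                             words_per_item=5, seed=0):
    """Sample toy user-item interactions plus item content (illustrative only).

    block_probs[c][k] is the assumed probability that a user in community c
    interacts with an item in cluster k. vocab_by_cluster[k] is a hypothetical
    list of words typical of cluster k.
    """
    rng = random.Random(seed)
    n_communities = len(block_probs)
    n_clusters = len(block_probs[0])

    # Latent assignments: one community per user, one cluster per item.
    user_comm = [rng.randrange(n_communities) for _ in range(n_users)]
    item_cluster = [rng.randrange(n_clusters) for _ in range(n_items)]

    # Interactions: independent Bernoulli draws governed by the block pair.
    interactions = [
        [1 if rng.random() < block_probs[user_comm[u]][item_cluster[i]] else 0
         for i in range(n_items)]
        for u in range(n_users)
    ]

    # Content: items in the same cluster draw words from the same vocabulary,
    # so content carries information about the latent item cluster.
    content = [
        [rng.choice(vocab_by_cluster[item_cluster[i]])
         for _ in range(words_per_item)]
        for i in range(n_items)
    ]
    return user_comm, item_cluster, interactions, content


if __name__ == "__main__":
    comm, clus, inter, words = sample_toy_bipartite_sbm(
        n_users=4, n_items=6,
        block_probs=[[0.9, 0.1], [0.1, 0.9]],
        vocab_by_cluster=[["graph", "network"], ["protein", "genome"]],
    )
    print(comm, clus)
    print(inter)
    print(words)
```

In this sketch an inference method that looks only at `interactions` ignores `content`, even though both are generated from the same item clusters. That shared dependence is the signal a content-augmented approach is meant to exploit.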

Authors (5)
  1. J. Massey Cashore
  2. Xiaoting Zhao
  3. Alexander A. Alemi
  4. Yujia Liu
  5. Peter I. Frazier
