
Moss: A Scalable Tool for Efficiently Sampling and Counting 4- and 5-Node Graphlets

Published 27 Sep 2015 in cs.SI | (1509.08089v4)

Abstract: Counting the frequencies of 3-, 4-, and 5-node undirected motifs (also known as graphlets) is widely used for understanding complex networks such as social and biological networks. However, computing these metrics for a large graph is a great challenge because the computation is intensive. Despite recent efforts to count triangles (i.e., 3-node undirected motif counting), little attention has been given to developing scalable tools that can be used to characterize 4- and 5-node motifs. In this paper, we develop computationally efficient methods to sample and count 4- and 5-node undirected motifs. Our methods provide unbiased estimators of motif frequencies, and we derive simple and exact formulas for the variances of the estimators. Moreover, our methods are designed to fit vertex-centric programming models, so they can be easily applied to current graph computing systems such as Pregel and GraphLab. We conduct experiments on a variety of real-world datasets, and experimental results show that our methods are several orders of magnitude faster than the state-of-the-art methods under the same estimation errors.
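To make the idea of an unbiased sampling estimator of a motif count concrete, here is a minimal Python sketch. It is not the Moss method described in the abstract; it is a naive baseline, assumed here purely for illustration, that draws 4-node subsets uniformly at random, checks whether each one forms a 4-clique, and rescales the hit rate by the number of 4-node subsets. The function names (`complete_graph`, `count_4cliques_exact`, `estimate_4cliques`) and the adjacency-set graph representation are hypothetical choices, not taken from the paper.

```python
import itertools
import math
import random


def complete_graph(n):
    """Adjacency-set representation of the complete graph on nodes 0..n-1."""
    return {u: {v for v in range(n) if v != u} for u in range(n)}


def is_clique(adj, nodes):
    """True if every pair of the given nodes is connected by an edge."""
    return all(v in adj[u] for u, v in itertools.combinations(nodes, 2))


def count_4cliques_exact(adj):
    """Brute-force count of 4-cliques; only feasible for small graphs."""
    return sum(
        1 for quad in itertools.combinations(sorted(adj), 4) if is_clique(adj, quad)
    )


def estimate_4cliques(adj, num_samples, rng):
    """Naive unbiased estimate (illustrative baseline, NOT the paper's method).

    Each sample is a uniformly random 4-node subset. The fraction of samples
    that are 4-cliques, multiplied by C(n, 4), has expectation equal to the
    true 4-clique count.
    """
    nodes = sorted(adj)
    total_subsets = math.comb(len(nodes), 4)
    hits = sum(1 for _ in range(num_samples) if is_clique(adj, rng.sample(nodes, 4)))
    return total_subsets * hits / num_samples


if __name__ == "__main__":
    graph = complete_graph(6)
    print(count_4cliques_exact(graph))
    print(estimate_4cliques(graph, 100, random.Random(0)))
```

Because each sample is an independent Bernoulli trial with success probability p = (true count) / C(n, 4), the variance of this naive estimator is C(n, 4)^2 · p(1 − p) / k for k samples. On sparse graphs p is tiny, so uniform subset sampling needs a very large number of samples, which is the kind of inefficiency that more careful sampling schemes aim to avoid.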
