External memory bisimulation reduction of big graphs (1210.0748v3)

Published 2 Oct 2012 in cs.DB and cs.DS

Abstract: In this paper, we present what are, to our knowledge, the first I/O-efficient solutions for computing the k-bisimulation partition of a massive directed graph and for maintaining such a partition upon updates to the underlying graph. Ubiquitous in the theory and application of graph data, bisimulation is a robust notion of node equivalence that intuitively groups together nodes in a graph that share fundamental structural features. k-bisimulation is the standard variant of bisimulation in which the topological features of nodes are considered only within a local neighborhood of radius $k \geqslant 0$. The I/O cost of our partition construction algorithm is bounded by $O(k\cdot \mathit{sort}(|E_t|) + k\cdot \mathit{scan}(|N_t|) + \mathit{sort}(|N_t|))$, while our maintenance algorithms are bounded by $O(k\cdot \mathit{sort}(|E_t|) + k\cdot \mathit{sort}(|N_t|))$. The space complexity bounds are $O(|N_t|+|E_t|)$ and $O(k\cdot|N_t|+k\cdot|E_t|)$, respectively. Here, $|E_t|$ and $|N_t|$ are the numbers of disk pages occupied by the input graph's edge set and node set, respectively, and $\mathit{sort}(n)$ and $\mathit{scan}(n)$ are the costs of sorting and scanning, respectively, a file occupying $n$ pages in external memory. Empirical analysis on a variety of massive real-world and synthetic graph datasets shows that our algorithms perform efficiently in practice, scaling gracefully as graphs grow in size.
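
To make the notion of a k-bisimulation partition concrete, the following is a minimal in-memory sketch of signature-based partition refinement. It is not the paper's external-memory algorithm: it assumes the whole graph fits in RAM, ignores edge labels, and uses hypothetical names (`k_bisimulation_partition`, `labels`, `edges`) chosen only for illustration. Each of the k rounds gives a node a signature built from its own label block and the set of blocks its successors held in the previous round; nodes with equal signatures land in the same block.

```python
from collections import defaultdict


def k_bisimulation_partition(labels, edges, k):
    """Return a dict mapping each node to its k-bisimulation block id.

    labels: dict node -> node label (assumed input format)
    edges:  iterable of (src, dst) pairs; edge labels are ignored here
    k:      neighborhood radius, k >= 0
    """
    succ = defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)

    def assign_ids(signatures):
        # Give every distinct signature a small integer block id.
        ids = {}
        return {node: ids.setdefault(sig, len(ids))
                for node, sig in signatures.items()}

    # Round 0: nodes are equivalent iff they carry the same label.
    block0 = assign_ids({n: labels[n] for n in labels})
    block = block0

    # Round j: combine the node's label block with the set of blocks
    # its successors belonged to after round j - 1.
    for _ in range(k):
        signatures = {
            n: (block0[n], frozenset(block[m] for m in succ.get(n, ())))
            for n in labels
        }
        block = assign_ids(signatures)
    return block


if __name__ == "__main__":
    labels = {n: "x" for n in "abcde"}
    edges = [("a", "b"), ("c", "d"), ("d", "e")]
    for radius in range(3):
        print(radius, k_bisimulation_partition(labels, edges, radius))
```

In this toy graph, nodes `a` and `c` are indistinguishable at radius 1 (both have one successor) but are separated at radius 2, because the successor of `c` has an outgoing edge while the successor of `a` does not. An external-memory version would presumably replace the in-memory dictionaries with sorted files of signatures, which is where the sort and scan terms in the stated I/O bounds would come from.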

Citations (37)
