Papers
Topics
Authors
Recent
2000 character limit reached

Graphsurge: Graph Analytics on View Collections Using Differential Computation

Published 11 Apr 2020 in cs.DB | (2004.05297v2)

Abstract: This paper presents the design and implementation of a new open-source view-based graph analytics system called Graphsurge. Graphsurge is designed to support applications that analyze multiple snapshots or views of a large-scale graph. Users program Graphsurge through a declarative graph view definition language (GVDL) to create views over input graphs and a Differential Dataflow-based programming API to write analytics computations. A key feature of GVDL is the ability to organize views into view collections, which allows Graphsurge to automatically share computation across views, without users writing any incrementalization code, by performing computations differentially. We then introduce two optimization problems that naturally arise in our setting. First is the collection ordering problem to determine the order of views that leads to minimum differences across consecutive views. We prove this problem is NP-hard and show a constant-factor approximation algorithm drawn from literature. Second is the collection splitting problem to decide on which views to run computations differentially vs from scratch, for which we present an adaptive solution that makes decisions at runtime. We present extensive experiments to demonstrate the benefits of running computations differentially for view collections and our collection ordering and splitting optimizations.

Citations (7)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.