Indexing Techniques for Graph Reachability Queries (2311.03542v2)
Abstract: We survey graph reachability indexing techniques for efficient processing of reachability queries in two popular graph models: plain graphs and edge-labeled graphs. Reachability queries are Boolean: they determine whether a directed path exists from a given source vertex to a given target vertex. They form a core class of navigational queries in graph processing. Reachability indexes are specialized data structures designed to accelerate the processing of these queries. Work on this topic goes back four decades, and we include 33 of the proposed techniques. Plain graphs contain only vertices and edges, so a reachability query on them checks only whether a path exists. Edge-labeled graphs augment plain graphs with edge labels, and their reachability queries add path constraints based on those labels, assessing both that a path exists and that it complies with the constraints. We categorize the techniques for both plain and edge-labeled graphs and discuss the approaches according to this classification, using existing techniques as exemplars. We discuss the main challenges within each class and how these might be addressed in other approaches. We conclude with a discussion of open research challenges and future research directions, along the lines of integrating reachability indexes into modern graph database management systems. This survey serves as a comprehensive resource for researchers and practitioners interested in the advancements, techniques, and challenges of reachability indexing in graph analytics.
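To make the two query types concrete, the following is a minimal sketch of the index-free baseline that reachability indexes aim to outperform: a breadth-first search run at query time. It is not taken from the survey. The function name `reachable`, the `(u, v, label)` edge-triple representation, and the `allowed_labels` parameter are all assumptions made for illustration, and restricting a path to a set of allowed labels is only one simple form of path constraint.

```python
from collections import deque


def reachable(edges, source, target, allowed_labels=None):
    """Return True if a directed path leads from source to target.

    edges: iterable of (u, v, label) triples (hypothetical representation).
    allowed_labels: None for a plain-graph query (labels are ignored);
    otherwise a set of labels that every edge on the path must carry.
    """
    # Build adjacency lists, keeping only edges that satisfy the constraint.
    adjacency = {}
    for u, v, label in edges:
        if allowed_labels is None or label in allowed_labels:
            adjacency.setdefault(u, []).append(v)

    # Standard breadth-first search from the source vertex.
    seen = {source}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        if vertex == target:
            return True
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Toy graph: a -> b -> c -> d, where the middle edge has a different label.
edges = [("a", "b", "follows"), ("b", "c", "likes"), ("c", "d", "follows")]

# Plain query: the labels are ignored, so the path a -> b -> c -> d counts.
plain_answer = reachable(edges, "a", "d")

# Label-constrained query: only "follows" edges may be used, so the path breaks at b.
labeled_answer = reachable(edges, "a", "d", allowed_labels={"follows"})
```

Each call traverses the graph from scratch, so the cost of a single query grows with the size of the graph. A reachability index trades precomputation time and storage for faster answers to the same Boolean question.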