GraphMatch: Subgraph Query Processing on FPGAs (2402.17559v1)
Abstract: Efficiently finding subgraph embeddings in large graphs is crucial for many application areas like biology and social network analysis. Set intersections are the predominant and most challenging aspect of current join-based subgraph query processing systems for CPUs. Previous work has shown the viability of utilizing FPGAs for acceleration of graph and join processing. In this work, we propose GraphMatch, the first general-purpose stand-alone subgraph query processing accelerator based on worst-case optimal joins (WCOJ) that is fully designed for modern field-programmable gate array (FPGA) hardware. For efficient processing of various graph data sets and query graph patterns, it leverages a novel set intersection approach, called AllCompare, tailor-made for FPGAs. We show that this set intersection approach efficiently solves multi-set intersections in subgraph query processing and outperforms CPU-based approaches. Overall, GraphMatch achieves speedups of over 2.68x and 5.16x compared to the state-of-the-art systems GraphFlow and RapidMatch, respectively.
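To illustrate why set intersections dominate join-based subgraph query processing, the following minimal sketch enumerates triangles by extending a partial match one query vertex at a time and intersecting the sorted adjacency lists of the vertices already matched. It is a plain CPU-side, merge-based intersection written for illustration only; it is not AllCompare and not GraphMatch's implementation, and the function names (`build_adjacency`, `intersect_sorted`, `find_triangles`) are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def build_adjacency(edges: Iterable[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Build sorted, deduplicated adjacency lists for an undirected graph."""
    adj: Dict[int, set] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return {u: sorted(neighbors) for u, neighbors in adj.items()}


def intersect_sorted(a: List[int], b: List[int]) -> List[int]:
    """Merge-based intersection of two ascending lists (illustrative only)."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def find_triangles(adj: Dict[int, List[int]]) -> List[Tuple[int, int, int]]:
    """Enumerate each triangle once as (u, v, w) with u < v < w.

    The third vertex is found by intersecting the neighbor lists of the
    two vertices matched so far, the vertex-at-a-time pattern used by
    worst-case optimal joins.
    """
    triangles: List[Tuple[int, int, int]] = []
    for u in sorted(adj):
        for v in adj[u]:
            if v <= u:
                continue  # enforce u < v to avoid duplicate matches
            for w in intersect_sorted(adj[u], adj[v]):
                if w > v:  # enforce v < w
                    triangles.append((u, v, w))
    return triangles


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3)]
    print(find_triangles(build_adjacency(edges)))
```

For larger query patterns, each additional query vertex adds another adjacency list to the intersection, which is the multi-set intersection case the abstract refers to.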