GraphMini: Accelerating Graph Pattern Matching Using Auxiliary Graphs (2403.01050v1)

Published 2 Mar 2024 in cs.DB and cs.PF

Abstract: Graph pattern matching is a fundamental problem encountered by many common graph mining tasks and the basic building block of several graph mining systems. This paper explores for the first time how to proactively prune graphs to speed up graph pattern matching by leveraging the structure of the query pattern and the input graph. We propose building auxiliary graphs, which are different pruned versions of the graph, during query execution. This requires careful balancing between the upfront cost of building and managing auxiliary graphs and the gains of faster set operations. To this end, we propose GraphMini, a new system that uses query compilation and a new cost model to minimize the cost of building and maintaining auxiliary graphs and maximize gains. Our evaluation shows that using GraphMini can achieve one order of magnitude speedup compared to state-of-the-art subgraph enumeration systems on commonly used benchmarks.
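The pruning idea in the abstract can be illustrated with a small sketch. This is not GraphMini's implementation: it omits the query compilation and cost model entirely, and the function names (`build_adjacency`, `count_4cliques_baseline`, `count_4cliques_pruned`) and the choice of 4-clique counting as the query are assumptions made for illustration. It only shows how a pruned copy of the adjacency lists, built once after the first vertex is matched, can shrink the inputs of the set intersections in the deeper loops.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_4cliques_baseline(adj):
    """Count 4-cliques with nested loops over the full adjacency sets.

    Vertices are matched in increasing id order (v0 < v1 < v2 < v3) so
    that each clique is counted exactly once.
    """
    count = 0
    for v0 in adj:
        for v1 in adj[v0]:
            if v1 <= v0:
                continue
            c2 = adj[v0] & adj[v1]  # candidates adjacent to v0 and v1
            for v2 in c2:
                if v2 <= v1:
                    continue
                c3 = c2 & adj[v2]  # intersects with the full list of v2
                count += sum(1 for v3 in c3 if v3 > v2)
    return count


def count_4cliques_pruned(adj):
    """Same query, but with a per-v0 auxiliary graph (illustrative only).

    Once v0 is fixed, every remaining vertex of the clique must be a
    higher-id neighbour of v0. The auxiliary graph keeps only those
    vertices and the edges among them, so the inner intersections run
    on smaller sets. Building it has an upfront cost, which is the
    trade-off the abstract describes.
    """
    count = 0
    for v0 in adj:
        higher = {u for u in adj[v0] if u > v0}
        aux = {u: adj[u] & higher for u in higher}  # pruned adjacency lists
        for v1 in higher:
            for v2 in aux[v1]:
                if v2 <= v1:
                    continue
                c3 = aux[v1] & aux[v2]
                count += sum(1 for v3 in c3 if v3 > v2)
    return count


# Small demo: a complete graph on 5 vertices plus one pendant edge.
demo_edges = list(combinations(range(5), 2)) + [(4, 5)]
demo_adj = build_adjacency(demo_edges)
print(count_4cliques_baseline(demo_adj), count_4cliques_pruned(demo_adj))
```

Both functions return the same count; the difference is only in how large the sets fed to each intersection are. Whether building the auxiliary graph pays off depends on how often it is reused in the inner loops, which is the balance the paper's cost model is said to address.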

Authors (4)
  1. Juelin Liu (3 papers)
  2. Sandeep Polisetty (7 papers)
  3. Hui Guan (34 papers)
  4. Marco Serafini (17 papers)
