CUTTANA: Scalable Graph Partitioning for Faster Distributed Graph Databases and Analytics (2312.08356v3)
Abstract: Graph partitioning plays a pivotal role in various distributed graph processing applications, including graph analytics, graph neural network training, and distributed graph databases. Graphs that require distributed settings are often too large to fit in the main memory of a single machine. This challenge renders traditional in-memory graph partitioners infeasible, leading to the emergence of streaming solutions. Streaming partitioners produce lower-quality partitions because they work from partial information and must make premature decisions before they have a complete view of a vertex's neighborhood. We introduce CUTTANA, a streaming graph partitioner that partitions massive graphs (Web/Twitter scale) with superior quality compared to existing streaming solutions. CUTTANA uses a novel buffering technique that prevents the premature assignment of vertices to partitions, and a scalable coarsening and refinement technique that enables a complete graph view, improving the intermediate assignment made by a streaming partitioner. We implemented a parallel version of CUTTANA that offers nearly the same partitioning latency as existing streaming partitioners. Our experimental analysis shows that CUTTANA consistently yields better partitioning quality than existing state-of-the-art streaming vertex partitioners in terms of both edge-cut and communication volume metrics. We also evaluate the workload latencies that result from using CUTTANA and other partitioners in distributed graph analytics and databases. CUTTANA outperforms the other methods in most scenarios (combinations of algorithm and dataset). In analytics applications, CUTTANA improves runtime performance by up to 59% compared to various streaming partitioners (HDRF, Fennel, Ginger, HeiStream). In graph database tasks, CUTTANA yields up to 23% higher query throughput without hurting tail latency.
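The abstract compares partitioners on edge-cut and communication volume but does not define either metric. As a minimal sketch, assuming the usual definitions for vertex partitioning (edge-cut counts edges whose endpoints fall in different partitions; communication volume counts, for each vertex, the distinct remote partitions that hold at least one of its neighbors), the two can be computed as follows. The function names, the toy graph, and the assignment `part` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def edge_cut(edges, part):
    """Count edges whose two endpoints are assigned to different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def communication_volume(edges, part):
    """Sum, over all vertices, the number of distinct remote partitions
    that contain at least one neighbor of the vertex."""
    remote = defaultdict(set)
    for u, v in edges:
        if part[u] != part[v]:
            remote[u].add(part[v])
            remote[v].add(part[u])
    return sum(len(parts) for parts in remote.values())


# Hypothetical toy input: an undirected graph on 5 vertices, 2 partitions.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
part = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1}

print(edge_cut(edges, part))              # 3
print(communication_volume(edges, part))  # 4
```

The two metrics can disagree. Vertex 0 has two cut edges into partition 1, yet it contributes only one unit of communication volume because both neighbors live in the same remote partition. This is one reason to report both numbers rather than edge-cut alone.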