GastCoCo: Graph Storage and Coroutine-Based Prefetch Co-Design for Dynamic Graph Processing (2312.14396v4)
Abstract: An efficient data structure is fundamental to meeting the growing demands of dynamic graph processing. However, the dual requirements of graph computation efficiency (which favors contiguous structures) and graph update efficiency (which favors linked-list-like structures) create a conflict in the design principles of graph structures. After experimental studies of existing state-of-the-art dynamic graph structures, we observe that the overhead of cache misses accounts for a major portion of the graph computation time. This paper presents GastCoCo, a system with graph storage and coroutine-based prefetch co-design. By employing software prefetching via stackless coroutines and introducing a prefetch-friendly data structure, CBList, GastCoCo significantly alleviates the performance degradation caused by cache misses. Our results show that GastCoCo outperforms state-of-the-art graph storage systems by 1.3x to 180x in graph updates and 1.4x to 41.1x in graph computation.
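To make the idea of software prefetching interleaved across stackless coroutines more concrete, the following C++ sketch sums several linked chains in round-robin fashion. It is only an illustration under stated assumptions, not GastCoCo's implementation, and it does not model CBList. The names `Node`, `SumTask`, and `interleaved_sums` are hypothetical, a hand-written state machine stands in for a compiler-generated stackless coroutine, and `__builtin_prefetch` is the GCC/Clang builtin (other compilers would need a different intrinsic).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical node type: one element of a linked, pointer-chased chain.
struct Node {
    std::uint64_t value;
    const Node* next;
};

// Hand-written stackless "coroutine": all state lives in this struct,
// so suspending is simply returning from resume().
struct SumTask {
    const Node* cur = nullptr;
    std::uint64_t sum = 0;

    // Runs one step and returns true while more work remains.
    bool resume() {
        if (cur == nullptr) return false;
        sum += cur->value;            // cur was prefetched before this step
        cur = cur->next;
        if (cur != nullptr) {
            __builtin_prefetch(cur);  // request the next node, then yield
        }
        return cur != nullptr;
    }
};

// Sums each chain, interleaving up to group_size traversals so that other
// tasks do useful work while a prefetched node is still on its way.
std::vector<std::uint64_t> interleaved_sums(
        const std::vector<const Node*>& heads, std::size_t group_size) {
    assert(group_size > 0);
    std::vector<std::uint64_t> out(heads.size(), 0);
    for (std::size_t base = 0; base < heads.size(); base += group_size) {
        const std::size_t end = std::min(base + group_size, heads.size());
        std::vector<SumTask> tasks;
        for (std::size_t i = base; i < end; ++i) {
            SumTask t;
            t.cur = heads[i];
            if (t.cur != nullptr) __builtin_prefetch(t.cur);
            tasks.push_back(t);
        }
        bool any_active = true;
        while (any_active) {          // round-robin scheduler
            any_active = false;
            for (SumTask& t : tasks) {
                any_active = t.resume() || any_active;
            }
        }
        for (std::size_t i = base; i < end; ++i) {
            out[i] = tasks[i - base].sum;
        }
    }
    return out;
}
```

Calling `interleaved_sums(heads, 2)` on a vector of chain heads returns one sum per chain. The group size bounds how many traversals are in flight at once. A suitable value depends on memory latency and per-step work, so it would need to be tuned empirically.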