GTX: A Write-Optimized Latch-free Graph Data System with Transactional Support -- Extended Version
Abstract: This paper introduces GTX, a standalone main-memory write-optimized graph data system that specializes in structural and graph property updates while enabling concurrent reads and graph analytics through ACID transactions. Recent graph systems target concurrent read and write support while guaranteeing transaction semantics. However, their performance suffers under updates that exhibit real-world temporal locality over the same vertices and edges, because of vertex-centric lock contention. GTX uses an adaptive delta-chain locking protocol on top of a carefully designed latch-free graph storage. The protocol eliminates vertex-level locking contention and adapts to real-life workloads while maintaining sequential access to the graph's adjacency-list storage. GTX's transactions further support cache-friendly block-level concurrency control, and cooperative group commit and garbage collection. This combination of features ensures high update throughput and provides low-latency graph analytics. Experimental evaluation shows that GTX achieves high read-write transaction throughput without sacrificing the performance of read-heavy analytical workloads, where it remains competitive with state-of-the-art systems. For write-heavy transactional workloads, GTX achieves up to 11x better transaction throughput than the best-performing state-of-the-art system.
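The contrast between vertex-level locking and delta-chain locking can be illustrated with a minimal sketch. This is not GTX's implementation. The class `DeltaChainLocks`, the modulo mapping from destination vertex to chain, and the fixed chain count are all hypothetical, and a Python `threading.Lock` merely stands in for the atomic compare-and-swap that a latch-free design would use. The sketch only shows why splitting one vertex's edge updates across several independently locked chains lets writers to different edges of the same hot vertex proceed without conflict.

```python
import threading


class DeltaChainLocks:
    """Hypothetical per-vertex lock table: one owner slot per delta chain,
    instead of a single lock covering the whole vertex."""

    def __init__(self, num_chains):
        # Assumption: _guard emulates an atomic compare-and-swap on a slot.
        self._guard = threading.Lock()
        self._owners = [None] * num_chains

    def chain_of(self, dst):
        # Assumption: edges map to chains by destination id modulo chain count.
        return dst % len(self._owners)

    def try_lock(self, dst, txn_id):
        """Try to claim the chain that holds edge (this vertex -> dst)."""
        idx = self.chain_of(dst)
        with self._guard:
            if self._owners[idx] in (None, txn_id):
                self._owners[idx] = txn_id
                return True
            return False  # another transaction owns this chain

    def unlock_all(self, txn_id):
        """Release every chain held by txn_id (for example at commit or abort)."""
        with self._guard:
            for i, owner in enumerate(self._owners):
                if owner == txn_id:
                    self._owners[i] = None


# Usage: two transactions update different edges of the same hot vertex.
locks = DeltaChainLocks(num_chains=4)
t1_ok = locks.try_lock(dst=1, txn_id="t1")        # chain 1
t2_ok = locks.try_lock(dst=2, txn_id="t2")        # chain 2, no conflict
t2_conflict = locks.try_lock(dst=5, txn_id="t2")  # chain 1 again, held by t1
```

With a single vertex-level lock, the second transaction would have been blocked on its first update. Here only the third call conflicts, because destinations 1 and 5 fall into the same chain. An adaptive scheme could change the number of chains per vertex as the workload shifts, which this fixed-size sketch does not attempt.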