
RadixGraph: Dynamic In-Memory Graph

Updated 11 January 2026
  • RadixGraph is a dynamic in-memory graph system that employs a space-optimized radix tree (SORT) for efficient vertex indexing and supports millions of concurrent operations.
  • It uses a hybrid snapshot–log architecture to manage edge storage, enabling rapid edge updates and low-latency query processing.
  • Empirical results show RadixGraph delivers up to 16x higher update throughput and 40% memory savings, highlighting its scalability and efficiency for dynamic workloads.

RadixGraph is a fully in-memory, dynamic graph data structure designed for high-throughput, space-efficient storage and updating of large-scale dynamic graphs. Its architecture is centered on two core innovations: a space-optimized canonical radix tree—SORT—for vertex indexing, and a hybrid snapshot–log layout per vertex for edge storage, which together enable fast vertex and edge updates, scalable concurrency, and compact memory usage. RadixGraph targets dynamic graph workloads in which both query latency and update throughput are critical, supporting millions of concurrent operations per second while achieving substantial memory reductions versus existing systems (Xie et al., 4 Jan 2026).

1. Formal Model and Components

A RadixGraph $G = (V, E)$ is maintained via two primary tables alongside specialized data structures:

  • Vertex Table (VT): An extensible array of size $N$ that holds, for each vertex $v$, a unique ID, associated metadata, and a pointer to an adjacency (edge) array.
  • SORT (Space-Optimized Radix Tree): A multi-level radix tree mapping vertex IDs (arbitrary, possibly non-contiguous 64-bit integers) to their corresponding byte offsets in VT.
  • Edge Array per Vertex ($EA_v$): For each $v \in VT$, an array of capacity $2\cdot\deg(v)$, partitioned into a read-only snapshot segment $S_v$ (consolidated neighbor list) and a write-only log segment $L_v$ for incremental updates.

This organization enables efficient implementations of graph mutator and query operations while minimizing space overhead.

2. SORT: Space-Optimized Radix Tree for Vertex Indexing

SORT is a canonical $l$-layer radix tree where each layer $i$ (for $0 \leq i < l$) splits the incoming vertex ID using a fan-out exponent $a_i \geq 1$. Each internal SORT node maintains a pointer array of $2^{a_i}$ entries, and leaf entries map directly to VT offsets. The assignment $\{a_0, a_1, \ldots, a_{l-1}\}$ is determined by an offline dynamic programming optimizer, minimizing expected pointer-array space subject to $\sum_{i=0}^{l-1} a_i \geq x$, where $x = \lceil \log_2(U) \rceil$ for keyspace $U$.

2.1 Algorithmic Operations

Insertion, search, and deletion require $O(l)$ time and operate by segmenting the input ID's binary representation into substrings of lengths $a_0, a_1, \ldots, a_{l-1}$. Brief pseudocode for insertion is as follows:

function InsertVertex(node N, int depth, bits v_id_bits):
    seg ← top a_depth bits of v_id_bits
    if depth == l-1:                          // leaf layer
        if N.children[seg] == NULL:
            allocate new VT entry at offset off
            N.children[seg] ← off             // leaf pointer
        return                                // existing ID: no-op
    if N.children[seg] == NULL:
        N.children[seg] ← new internal node with 2^{a_{depth+1}} slots
    InsertVertex(N.children[seg], depth+1, remaining bits of v_id_bits)

Lookup returns “not found” if any pointer along the path is uninitialized. Deletion marks the corresponding VT entry with an MVCC deletion timestamp and recycles its offset via a lock-free freelist.
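
To make the layered traversal concrete, the following is a minimal single-threaded Python sketch of SORT insertion and lookup. Class and method names (`Sort`, `SortNode`, `_segments`) are illustrative rather than taken from the RadixGraph codebase, and the MVCC deletion path and lock-free freelist are omitted:

```python
class SortNode:
    """Internal SORT node: a pointer array of 2^{a_i} child slots."""
    __slots__ = ("children",)

    def __init__(self, slots):
        self.children = [None] * slots


class Sort:
    """Layered radix index mapping vertex IDs to VT offsets (sketch)."""

    def __init__(self, exps):            # e.g. exps = [4, 4, 8] for 16-bit IDs
        self.exps = exps
        self.bits = sum(exps)            # total key bits x
        self.root = SortNode(1 << exps[0])

    def _segments(self, vid):
        # split the ID's binary form into substrings of lengths a_0, a_1, ...
        segs, shift = [], self.bits
        for a in self.exps:
            shift -= a
            segs.append((vid >> shift) & ((1 << a) - 1))
        return segs

    def insert(self, vid, vt_offset):
        segs = self._segments(vid)
        node = self.root
        for depth, seg in enumerate(segs[:-1]):
            if node.children[seg] is None:   # grow the path on demand
                node.children[seg] = SortNode(1 << self.exps[depth + 1])
            node = node.children[seg]
        node.children[segs[-1]] = vt_offset  # leaf slot stores the VT offset

    def lookup(self, vid):
        segs = self._segments(vid)
        node = self.root
        for seg in segs[:-1]:
            node = node.children[seg]
            if node is None:                 # uninitialized pointer: not found
                return None
        return node.children[segs[-1]]
```

Each operation touches exactly $l$ nodes, matching the $O(l)$ bound above; the per-level slot counts come directly from the optimizer's fan-out exponents.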

2.2 Space Analysis

The expected space for SORT is given by:

$$\sum_{i=0}^{l-1} 2^{a_i}\, E[\#\,\text{nodes at layer } i]$$

where $a_i$ is the fan-out exponent at layer $i$. A closed-form solution under uniform key distribution leads to the integer program:

$$\min_{a_i \in \mathbb{N}}\; \sum_{i=0}^{l-1} 2^{a_i} \left(1 - \frac{\binom{2^x - 2^{s_i}}{n}}{\binom{2^x}{n}}\right) \quad \text{s.t.} \quad \sum_{i=0}^{l-1} a_i \geq x,$$

with $s_i = \sum_{j=i}^{l-1} a_j$. The optimizer solves this in $O(n l x^2)$ time and $O(l x)$ space, yielding in practice an $O(n)$ memory profile except in pathologically sparse ID cases.
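
As an illustration of how such an optimizer might be realized, the following Python sketch solves the program above by dynamic programming over the number of remaining key bits. The function name `optimize_fanouts` is hypothetical, and for simplicity it evaluates exact binomial coefficients directly rather than the paper's recurrences:

```python
from math import comb

def optimize_fanouts(x, n):
    """Pick fan-out exponents a_0..a_{l-1} (summing to x) that minimize
    the expected pointer-array space under uniform key placement.

    x: total key bits (x = ceil(log2 U)); n: number of stored keys.
    """
    total = comb(2 ** x, n)              # all placements of n keys in 2^x slots
    INF = float("inf")
    best = [0.0] + [INF] * x             # best[r]: min cost for bottom r bits
    choice = [0] * (x + 1)
    for r in range(1, x + 1):
        # for a layer with suffix length s_i = r: probability that a
        # subtree covering 2^r keys receives at least one key
        p_nonempty = 1 - comb(2 ** x - 2 ** r, n) / total
        for a in range(1, r + 1):
            cost = 2 ** a * p_nonempty + best[r - a]
            if cost < best[r]:
                best[r], choice[r] = cost, a
    exps, r = [], x                      # recover layers, top-most first
    while r > 0:
        exps.append(choice[r])
        r -= choice[r]
    return exps, best[x]
```

For 64-bit keyspaces the binomials would be computed in log-space or approximated; the exact form here keeps the sketch short and is practical for small $x$.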

3. Hybrid Snapshot–Log Architecture for Edge Storage

In RadixGraph, every vertex's adjacency list is realized as a composite array $EA_u[0 \ldots 2d-1]$ for $d = \deg(u)$. The first $d$ entries comprise the snapshot segment $S_u$, capturing the compacted, immutable neighbor set; the next $d$ form the write-log $L_u$, which accumulates insertions, deletions, and updates as tuples $(\mathit{dest\_off}, \mathit{weight}, \mathit{timestamp})$. When $|L_u| = d$, a compaction phase merges $S_u \cup L_u$ into a new snapshot $S'_u$ and resets $L_u$.

3.1 Edge Update and Neighbor Scan

Edge insertions, deletions, and weight updates are all append-only into LuL_u and performed via atomic increments of the edge array size. Compactions acquire a per-vertex latch only as necessary. Neighbor-list queries perform a backward scan, outputting the latest valid entry per neighbor not deleted as of the snapshot time. The following pseudocode formalizes insertion:

InsertEdge(u→v, w, t):
    off_u ← SORT.lookup(u)
    off_v ← SORT.lookup_or_insert(v)
    idx ← atomic_fetch_add(EA_u.LogSize, 1)   // claim the next log slot
    EA_u.Log[idx] ← (off_v, w, t)
    if idx+1 == EA_u.Capacity/2:              // log full: |L_u| = d
        compact(EA_u)

Amortized update cost is $O(1)$: each compaction's cost is bounded and charged against the $d$ log insertions that preceded it.
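
The snapshot–log discipline can be sketched in a few lines of Python. This is a simplified single-vertex model, with a list-backed log and a coarse lock standing in for the atomic size counter and per-vertex latch; all names (`EdgeArray`, `TOMBSTONE`) are illustrative, not RadixGraph's actual API:

```python
import threading

TOMBSTONE = object()  # marker for a logged deletion (an assumption of this sketch)

class EdgeArray:
    """Per-vertex hybrid snapshot-log adjacency (illustrative sketch)."""

    def __init__(self):
        self.snapshot = []             # compacted, immutable (dest, weight) pairs
        self.log = []                  # appended (dest, weight, ts) updates
        self.latch = threading.Lock()  # stands in for the per-vertex latch

    def insert(self, dest, weight, ts):
        self.log.append((dest, weight, ts))
        self._maybe_compact()

    def delete(self, dest, ts):
        self.log.append((dest, TOMBSTONE, ts))
        self._maybe_compact()

    def neighbors(self):
        # backward scan: the newest log entry per destination wins
        seen, out = set(), []
        for dest, w, _ in reversed(self.log):
            if dest in seen:
                continue
            seen.add(dest)
            if w is not TOMBSTONE:
                out.append((dest, w))
        for dest, w in self.snapshot:  # untouched snapshot entries survive
            if dest not in seen:
                out.append((dest, w))
        return out

    def _maybe_compact(self):
        # compact when the log reaches the snapshot's size (|L_u| = d)
        if len(self.log) >= max(1, len(self.snapshot)):
            with self.latch:
                self.snapshot = sorted(self.neighbors())
                self.log.clear()
```

The backward scan realizes the read path described above: updates and deletes are resolved in timestamp order without ever mutating the snapshot in place.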

3.2 Complexity Guarantees

  • Insert, Delete, Update (Edge): Amortized $O(1)$ per operation.
  • Get Neighbors: $O(d)$, where $d$ is vertex degree.
  • Vertex operations (via SORT): $O(l) = O(\log\log u)$, where $u$ is the ID space.

4. Space and Performance Characteristics

Empirical and analytical results demonstrate the following properties:

  • Update Throughput: Up to $16.27\times$ higher than the highest-performing baseline on the twitter-2010 dataset.
  • Memory Efficiency: Achieves an average $40.1\%$ reduction in memory usage relative to the closest competing graph store.
  • Analytic Query Speed: Delivers up to $6.11\times$ faster 2-hop queries and $1.7$–$27.7\times$ faster BFS/SSSP operations.
  • Concurrent Scalability: Maintains stable $O(1)$ latency under intense update and query loads, achieving linear scaling for multi-version concurrency control (MVCC).
  • Total Space: $O(m+n)$, where $m$ is the edge count and $n$ the vertex count, comprising:
    • SORT: $O(n)$ (practically, except for extreme ID sparsity)
    • VT: $\leq 32n$ bytes plus freelist overhead
    • Edges: $20m$ bytes ($8m$ for snapshot + $12m$ for log entries)
    • Duplicate checker: $O(t\,L\,\lceil n/L \rceil/8)$ bytes (for $t$ threads, bitmap segment size $L$)
| Component | Practical Memory Usage | Asymptotic Bound |
| --- | --- | --- |
| SORT | $O(n)$ | $O(n\,g)$ (worst-case) |
| Vertex Table | $\leq 32n$ bytes | $O(n)$ |
| Edge Storage | $20m$ bytes total | $O(m)$ |
| Duplicate Check | $O(t\,L\,\lceil n/L \rceil/8)$ bytes | |
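
The duplicate-checker bound corresponds to per-thread bitmaps over vertex offsets, used to deduplicate destinations during scans. A minimal sketch of one such bitmap follows; the single-segment simplification (ignoring the segment size $L$) and the class name are assumptions of this sketch, not the paper's layout:

```python
class DuplicateChecker:
    """One thread's bitmap over n vertex offsets: n bits in ~n/8 bytes."""

    def __init__(self, n):
        self.bits = bytearray((n + 7) // 8)

    def test_and_set(self, off):
        # return whether off was already marked, then mark it
        byte, mask = off >> 3, 1 << (off & 7)
        seen = bool(self.bits[byte] & mask)
        self.bits[byte] |= mask
        return seen

    def clear(self):
        for i in range(len(self.bits)):   # reset between scans
            self.bits[i] = 0
```

With $t$ threads each holding such a bitmap, total space is roughly $t \cdot n/8$ bytes, consistent with the $O(t\,L\,\lceil n/L \rceil/8)$ bound when segmentation is factored in.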

5. Implementation and Concurrency Design

RadixGraph is realized using modern concurrency primitives and open-source libraries:

  • Intel TBB concurrent_vector powers VT and SORT for efficient, thread-safe segment-doubling.
  • ROWEX-style atomic bitmaps enable lock-free concurrent reads with CAS-synchronized writes.
  • Per-node and per-vertex latching are reserved for infrequent compactions; read operations require no lock acquisition.
  • Multi-version edge arrays create a singly linked version chain for snapshot queries at timestamp tt, supporting both read-committed and snapshot isolation levels in MVCC.
  • Source code and technical documentation are publicly available at https://github.com/ForwardStar/RadixGraph.
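
The version-chain bullet above can be sketched as follows. This is an illustrative single-writer Python model in which `publish` stands in for the atomic pointer swap performed by the real system; names and signatures are assumptions, not RadixGraph's API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    edges: tuple                      # frozen neighbor list for this version
    ts: int                           # commit timestamp of this snapshot
    prev: Optional["Version"] = None  # older link in the singly linked chain

class VersionedAdjacency:
    """Readers resolve the newest version with ts <= their snapshot time."""

    def __init__(self):
        self.head = Version(edges=(), ts=0)

    def publish(self, edges, ts):
        # writers prepend a new version; an atomic swap in the real system
        self.head = Version(tuple(edges), ts, prev=self.head)

    def read_at(self, snapshot_ts):
        v = self.head
        while v is not None and v.ts > snapshot_ts:
            v = v.prev                # walk back to a visible version
        return v.edges if v is not None else ()
```

Because readers only follow immutable links, no lock acquisition is needed on the read path, matching the design described above.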

6. Limitations and Open Challenges

Several avenues for improvement and extension are identified:

  • Transactional Semantics: Only MVCC with read-committed and snapshot isolation is provided; fully serializable transactions are not yet supported.
  • Adaptivity to Skewed ID Distributions: Enhancements are possible via more localized re-optimization of SORT parameters under non-uniform ID assignment.
  • Edge Array Deletion Overhead: Work remains on log-size tuning and space reclamation strategies for delete-heavy workloads.
  • Persistent and Tiered Storage: Integration of an on-disk or hybrid storage tier for scaling beyond main memory is under exploration.

A plausible implication is that further adaptation of SORT to heterogeneous workload characteristics, along with deeper integration into tiered or distributed systems, could extend RadixGraph’s applicability to new domains within large-scale dynamic graph management (Xie et al., 4 Jan 2026).
