RadixGraph: Dynamic In-Memory Graph
- RadixGraph is a dynamic in-memory graph system that employs a space-optimized radix tree (SORT) for efficient vertex indexing and supports millions of concurrent operations per second.
- It uses a hybrid snapshot–log architecture to manage edge storage, enabling rapid edge updates and low-latency query processing.
- Empirical results show RadixGraph delivers up to 16x higher update throughput and 40% memory savings, highlighting its scalability and efficiency for dynamic workloads.
RadixGraph is a fully in-memory, dynamic graph data structure designed for high-throughput, space-efficient storage and updating of large-scale dynamic graphs. Its architecture is centered on two core innovations: a space-optimized canonical radix tree—SORT—for vertex indexing, and a hybrid snapshot–log layout per vertex for edge storage, which together enable fast vertex and edge updates, scalable concurrency, and compact memory usage. RadixGraph targets dynamic graph workloads in which both query latency and update throughput are critical, supporting millions of concurrent operations per second while achieving substantial memory reductions versus existing systems (Xie et al., 4 Jan 2026).
1. Formal Model and Components
A RadixGraph is maintained via two primary tables alongside specialized data structures:
- Vertex Table (VT): An extensible array of size $n$ that holds, for each vertex $v$, a unique ID, associated metadata, and a pointer to an adjacency (edge) array.
- SORT (Space-Optimized Radix Tree): A multi-layer radix tree with per-layer fan-out $2^{a_i}$, mapping vertex IDs (arbitrary, possibly non-contiguous 64-bit integers) to their corresponding byte offsets in VT.
- Edge Array per Vertex ($\mathrm{EA}_v$): For each vertex $v$, an array of capacity $C_v$, partitioned into a read-only snapshot segment $S_v$ (consolidated neighbor list) and a write-only log segment $L_v$ for incremental updates.
This organization enables efficient implementations of graph mutator and query operations while minimizing space overhead.
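The following C++ sketch illustrates how these components fit together; all field names and widths are illustrative assumptions rather than the actual RadixGraph layout:

```cpp
// Illustrative layout of the three core components (hypothetical
// field names and sizes; the paper's concrete encoding may differ).
#include <atomic>
#include <cstdint>
#include <vector>

struct EdgeArray;                 // hybrid snapshot-log array (Section 3)

struct VertexEntry {              // one slot of the Vertex Table (VT)
    uint64_t   id;                // user-visible, possibly sparse 64-bit ID
    uint64_t   delete_ts;         // MVCC deletion timestamp (0 = live)
    EdgeArray* edges;             // pointer to this vertex's adjacency array
};

struct SortNode {                 // internal node of the radix tree
    // children[seg] holds either a SortNode* (internal layers) or a byte
    // offset into VT (leaf layer); 2^{a_i} slots at layer i.
    std::vector<std::atomic<uintptr_t>> children;
    explicit SortNode(int fanout_exponent)
        : children(size_t{1} << fanout_exponent) {}
};

struct RadixGraph {
    std::vector<VertexEntry> vt;  // extensible vertex table
    SortNode*  sort_root;         // maps vertex IDs to VT offsets
    std::vector<int> fanouts;     // per-layer exponents a_0 .. a_{l-1}
};
```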
2. SORT: Space-Optimized Radix Tree for Vertex Indexing
SORT is a canonical $\ell$-layer radix tree where each layer $i$ (for $0 \le i < \ell$) splits the incoming vertex ID using a fan-out exponent $a_i$. Each internal SORT node at layer $i$ maintains a pointer array of $2^{a_i}$ entries, and leaf entries map directly to VT offsets. The assignment $(a_0, \dots, a_{\ell-1})$ is determined by an offline dynamic programming optimizer, minimizing expected pointer-array space subject to $\sum_{i=0}^{\ell-1} a_i = b$, where $b = \log_2 |U|$ for keyspace $U$.
2.1 Algorithmic Operations
Insertion, search, and deletion require $O(\ell)$ time and operate by segmenting the input ID’s binary representation into substrings of lengths $a_0, \dots, a_{\ell-1}$. Brief pseudocode for insertion is as follows:
```
function InsertVertex(node N, int depth, bits v_id_bits):
    seg  ← top a_depth bits of v_id_bits      // segment for this layer
    rest ← remaining bits of v_id_bits
    if depth == l-1:                          // leaf layer
        if N.children[seg] == NULL:
            allocate new VT entry at offset off
            N.children[seg] ← off             // leaf pointer into VT
        return N.children[seg]
    if N.children[seg] == NULL:               // grow the path lazily
        N.children[seg] ← new internal node with 2^{a_{depth+1}} slots
    return InsertVertex(N.children[seg], depth+1, rest)
```
Lookup returns “not found” if any pointer along the path is uninitialized. Deletion marks the corresponding VT entry with an MVCC deletion timestamp and recycles its offset via a lock-free freelist.
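A minimal sketch of this deletion path, assuming a Treiber-stack freelist (the paper specifies only that offsets are recycled lock-free; all identifiers here are hypothetical):

```cpp
// Vertex deletion: write an MVCC tombstone into the VT entry, then
// push its offset onto a lock-free (Treiber-stack) freelist.
#include <atomic>
#include <cstdint>

struct VertexEntry { uint64_t id; uint64_t delete_ts; };  // as in Section 1

struct FreeNode { uint64_t vt_offset; FreeNode* next; };

struct FreeList {                       // Treiber-stack freelist
    std::atomic<FreeNode*> head{nullptr};
    void push(uint64_t off) {
        FreeNode* n = new FreeNode{off, head.load(std::memory_order_relaxed)};
        // CAS loop: on failure, n->next is refreshed with the current head.
        while (!head.compare_exchange_weak(n->next, n,
                                           std::memory_order_release,
                                           std::memory_order_relaxed)) {}
    }
};

void delete_vertex(VertexEntry& e, uint64_t vt_offset,
                   uint64_t now, FreeList& fl) {
    e.delete_ts = now;   // MVCC tombstone: readers at t < now still see e
    fl.push(vt_offset);  // the slot can be recycled by later insertions
}
```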
2.2 Space Analysis
The expected space for SORT is given by

$$\mathbb{E}[S] \;=\; \sum_{i=0}^{\ell-1} \mathbb{E}[N_i]\, 2^{a_i},$$

where $a_i$ is the fan-out exponent at layer $i$ and $\mathbb{E}[N_i]$ is the expected number of non-empty nodes at layer $i$. A closed-form expression for $\mathbb{E}[N_i]$ under a uniform key distribution leads to the integer program

$$\min_{a_0, \dots, a_{\ell-1}} \; \sum_{i=0}^{\ell-1} \mathbb{E}[N_i]\, 2^{a_i} \quad \text{subject to} \quad \sum_{i=0}^{\ell-1} a_i = b,$$

with $b = \log_2 |U|$ for keyspace $U$. The optimizer solves this offline by dynamic programming over layers and bit budgets, yielding in practice an $O(n)$ memory profile except in pathologically sparse ID cases.
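The following self-contained C++ sketch shows one plausible form of this dynamic-programming optimizer under the uniform-key cost model above; the paper’s exact cost model and constants may differ:

```cpp
// Offline fan-out optimizer (hypothetical reconstruction). Under uniform
// keys, the expected number of non-empty nodes at bit-prefix depth p is
//   N(p) = 2^p * (1 - (1 - 2^-p)^n),
// and a layer with fan-out exponent a adds N(p) * 2^a pointer slots.
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

std::vector<int> optimize_fanouts(double n, int b, int max_a = 16) {
    const double INF = 1e300;
    std::vector<double> dp(b + 1, INF);  // dp[p]: min slots to consume p bits
    std::vector<int> choice(b + 1, -1);  // exponent chosen for the last layer
    dp[0] = 0.0;
    auto nodes = [n](int p) {            // expected non-empty nodes at depth p
        double q = std::ldexp(1.0, p);   // 2^p, computed exactly
        return q * -std::expm1(n * std::log1p(-1.0 / q));
    };
    for (int p = 0; p < b; ++p) {
        if (dp[p] >= INF) continue;
        for (int a = 1; a <= max_a && p + a <= b; ++a) {
            double cost = dp[p] + nodes(p) * std::ldexp(1.0, a);
            if (cost < dp[p + a]) { dp[p + a] = cost; choice[p + a] = a; }
        }
    }
    std::vector<int> layers;             // recover a_0 .. a_{l-1}
    for (int p = b; p > 0; p -= choice[p]) layers.push_back(choice[p]);
    std::reverse(layers.begin(), layers.end());
    return layers;
}

int main() {
    for (int a : optimize_fanouts(/*n=*/1e8, /*b=*/64)) std::cout << a << ' ';
    std::cout << '\n';
}
```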
3. Hybrid Snapshot–Log Architecture for Edge Storage
In RadixGraph, every vertex’s adjacency list is realized as a composite array $\mathrm{EA}_v$ of capacity $C_v$. The first $|S_v|$ entries comprise the snapshot segment $S_v$, capturing the compacted, immutable neighbor set; the remaining $C_v - |S_v|$ slots form the write-log $L_v$, which accumulates insertions, deletions, and updates as tuples $(v', w, t)$ of neighbor, weight, and timestamp. When $L_v$ fills, a compaction phase merges $L_v$ into a new snapshot and resets $L_v$.
3.1 Edge Update and Neighbor Scan
Edge insertions, deletions, and weight updates are all appended to $L_v$ via an atomic increment of the edge-array size. Compactions acquire a per-vertex latch only when necessary. Neighbor-list queries perform a backward scan, emitting, for each neighbor, the latest entry that is valid (not deleted) as of the query timestamp. The following pseudocode formalizes insertion:
```
InsertEdge(u→v, w, t):
    off_u ← SORT.lookup(u)                   // source must already exist
    off_v ← SORT.lookup_or_insert(v)         // create destination on demand
    idx ← atomic_fetch_add(EA_u.Size, 1)     // claim the next log slot
    EA_u[idx] ← (off_v, w, t)                // append-only write
    if idx + 1 == EA_u.Capacity / 2:         // log segment has filled up
        compact(EA_u)                        // merge log into a new snapshot
```
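A C++ sketch of the corresponding backward neighbor scan and compaction follows; the `EdgeEntry`/`EdgeArray` types, the tombstone flag, and the capacity policy are illustrative assumptions rather than the system’s actual layout:

```cpp
// Hybrid snapshot-log edge array: backward scan and compaction (sketch).
#include <atomic>
#include <cstdint>
#include <unordered_set>
#include <vector>

struct EdgeEntry {
    uint64_t neighbor;    // VT offset of the destination vertex
    double   weight;
    uint64_t timestamp;   // commit time of this entry
    bool     deleted;     // tombstone for edge deletions
};

struct EdgeArray {
    std::vector<EdgeEntry> slots;  // [0, snapshot_size) snapshot, rest log
    size_t snapshot_size = 0;
    std::atomic<size_t> size{0};   // total entries; bumped by writers
};

// Backward scan: the newest entry per neighbor wins, so log entries
// shadow the snapshot and older log entries.
std::vector<EdgeEntry> get_neighbors(const EdgeArray& ea) {
    std::unordered_set<uint64_t> seen;
    std::vector<EdgeEntry> out;
    for (size_t i = ea.size.load(std::memory_order_acquire); i-- > 0; ) {
        const EdgeEntry& e = ea.slots[i];
        if (!seen.insert(e.neighbor).second) continue;  // shadowed entry
        if (!e.deleted) out.push_back(e);               // drop tombstones
    }
    return out;
}

// Compaction: fold the log into a fresh snapshot. In the real system a
// per-vertex latch excludes concurrent writers during this step.
void compact(EdgeArray& ea) {
    std::vector<EdgeEntry> snap = get_neighbors(ea);
    ea.slots.assign(snap.begin(), snap.end());
    ea.snapshot_size = snap.size();
    ea.slots.resize(2 * ea.snapshot_size + 1);  // room for a fresh log
    ea.size.store(ea.snapshot_size, std::memory_order_release);
}
```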
Amortized update cost is established as $O(1)$, since the cost of each compaction is bounded by the number of log insertions that precede it.
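Written out, the charging argument takes the standard amortization form (constants illustrative):

$$\underbrace{O(1)}_{\text{log append}} \;+\; \frac{\overbrace{O(C_v)}^{\text{one compaction}}}{\underbrace{\Theta(C_v)}_{\text{appends between compactions}}} \;=\; O(1)\ \text{amortized per edge update.}$$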
3.2 Complexity Guarantees
- Insert, Delete, Update (Edge): $O(1)$ amortized per operation.
- Get Neighbors: $O(d_v)$, where $d_v$ is the vertex degree.
- Vertex operations (via SORT): $O(\ell)$, where $\ell$ is the number of layers (at most $\log_2 |U|$ for ID space $U$).
4. Space and Performance Characteristics
Empirical and analytical results demonstrate the following properties:
- Update Throughput: Up to $16\times$ higher than the highest-performing baseline on the twitter-2010 dataset.
- Memory Efficiency: Achieves an average $40\%$ reduction in memory usage relative to the closest competing graph store.
- Analytic Query Speed: Delivers substantially faster 2-hop queries and BFS/SSSP operations than competing systems.
- Concurrent Scalability: Maintains stable latency under intense update and query loads, achieving linear scaling for multi-version concurrency control (MVCC).
- Total Space: $O(n + m)$, where $m$ is the edge count and $n$ the vertex count, comprising:
  - SORT: $O(n)$ in practice (larger only under extreme ID sparsity)
  - VT: a fixed-size entry per vertex, $O(n)$ bytes plus freelist overhead
  - Edges: $20m$ bytes ($8m$ for snapshot + $12m$ for log entries)
  - Duplicate checker: $O(T \cdot B)$ bytes (for $T$ threads, bitmap segment size $B$)
| Component | Practical Memory Usage | Asymptotic Bound |
|---|---|---|
| SORT | $O(n)$ in practice | larger in sparse-ID worst case |
| Vertex Table | fixed-size entry per vertex, plus freelist | $O(n)$ bytes |
| Edge Storage | $20m$ bytes total | $O(m)$ |
| Duplicate Check | $O(T \cdot B)$ bytes | — |
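As an illustrative back-of-the-envelope check (numbers chosen for convenience, not drawn from the paper), edge storage dominates the footprint: for $m = 1.5 \times 10^9$ edges and $n = 5 \times 10^7$ vertices,

$$20m \text{ bytes} = 20 \times 1.5 \times 10^9 \approx 30\ \text{GB}, \qquad \text{while VT and SORT contribute only } O(n) \text{ bytes.}$$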
5. Implementation and Concurrency Design
RadixGraph is realized using modern concurrency primitives and open-source libraries:
- Intel TBB concurrent_vector powers VT and SORT for efficient, thread-safe segment-doubling.
- ROWEX-style atomic bitmaps enable lock-free concurrent reads with CAS-synchronized writes (see the sketch after this list).
- Per-node and per-vertex latching are reserved for infrequent compactions; read operations require no lock acquisition.
- Multi-version edge arrays create a singly linked version chain for snapshot queries at timestamp $t$, supporting both read-committed and snapshot isolation levels in MVCC.
- Source code and technical documentation are publicly available at [https://github.com/ForwardStar/RadixGraph].
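A minimal C++ sketch of the ROWEX-style access pattern referenced above: readers issue plain atomic loads and never lock, while writers synchronize through CAS. The 64-bit word granularity and all names are assumptions, not the library’s actual API:

```cpp
// ROWEX-style bitmap sketch: lock-free reads, CAS-synchronized writes.
#include <atomic>
#include <cstdint>
#include <vector>

class AtomicBitmap {
    std::vector<std::atomic<uint64_t>> words_;
public:
    explicit AtomicBitmap(size_t bits) : words_((bits + 63) / 64) {}

    bool test(size_t i) const {             // reader path: no locks taken
        return (words_[i / 64].load(std::memory_order_acquire)
                >> (i % 64)) & 1u;
    }

    void set(size_t i) {                    // writer path: CAS retry loop
        std::atomic<uint64_t>& w = words_[i / 64];
        uint64_t old = w.load(std::memory_order_relaxed);
        while (!w.compare_exchange_weak(old, old | (uint64_t{1} << (i % 64)),
                                        std::memory_order_release,
                                        std::memory_order_relaxed)) {}
    }
};
```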
6. Limitations and Open Challenges
Several avenues for improvement and extension are identified:
- Transactional Semantics: Only MVCC with read-committed and snapshot isolation is provided; fully serializable transactions are not yet supported.
- Adaptivity to Skewed ID Distributions: Enhancements are possible via more localized re-optimization of SORT parameters under non-uniform ID assignment.
- Edge Array Deletion Overhead: Work remains on log-size tuning and space reclamation strategies for delete-heavy workloads.
- Persistent and Tiered Storage: Integration of an on-disk or hybrid storage tier for scaling beyond main memory is under exploration.
A plausible implication is that further adaptation of SORT to heterogeneous workload characteristics, along with deeper integration into tiered or distributed systems, could extend RadixGraph’s applicability to new domains within large-scale dynamic graph management (Xie et al., 4 Jan 2026).