
Kanva: Lock-Free Learned Search Framework

Updated 17 February 2026
  • Kanva is a framework for learned, lock-free search data structures that integrate piecewise-linear models with non-blocking synchronization.
  • It organizes data into a shallow hierarchy of MNodes and dynamic bins, enabling rapid, unique, model-guided search paths.
  • Evaluations show Kanva outperforms prior non-blocking structures such as E-ABT and C-IST in throughput and cache efficiency under diverse workloads.

Kanva is a framework for learned, lock-free search data structures designed to deliver high scalability and strong progress guarantees on multi-core architectures. It integrates piecewise-linear learned models for rapid key prediction with provably non-blocking synchronization, achieving significant throughput and cache efficiency improvements over prior non-blocking search structures. Kanva’s core innovation lies in its shallow hierarchy of lightweight modelled nodes (MNodes) and dynamic, lock-free bins that combine learned arithmetic search with fully linearizable concurrency semantics (Bhardwaj et al., 2023).

1. Structural Design and Search Pathways

Kanva organizes the key space as a shallow, unbalanced tree comprising two node types:

  • MNodes (“modelled nodes”): Internal nodes, each containing one or more tiny linear models to predict the subrange for a given search key.
  • Bins: Leaf nodes, initially implemented as sorted linked lists (one-level), which absorb all insert, delete, and update operations. When overloaded, bins upgrade to two-level arrays of linked-list pages and, upon reaching a fixed threshold, are “frozen” and converted into new MNodes via lock-free retraining.

Each root MNode contains an array of piecewise-linear models approximating the cumulative distribution function (CDF) of the keys, the associated split points, and child pointers (some null-initialized). Non-root MNodes hold a sorted array keys[], parallel versioned-value lists versions[], a single linear model (a, b, ε), and B+1 child pointers. Traversals proceed top-down from the root to a leaf bin, guided by the model predictions rather than comparator-driven tree walks, so each search key follows a unique path. Because ancestor nodes are never restructured after installation, that path remains fixed for the structure's lifetime.
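The node layout and model-guided child selection described above can be sketched as follows. This is a simplified, single-threaded illustration: the class and field names paraphrase the description and are not taken from Kanva's actual source.

```python
from dataclasses import dataclass, field

@dataclass
class Bin:
    # Leaf node: absorbs inserts/deletes until it reaches capacity B.
    entries: list = field(default_factory=list)  # sorted (key, value) pairs
    frozen: bool = False

@dataclass
class MNode:
    # Internal node: sorted keys, one linear model (a, b, eps),
    # and B+1 child slots (None until a bin is installed).
    keys: list
    a: float
    b: float
    eps: int
    children: list

    def child_index(self, key):
        # The model predicts a slot; the error bound eps restricts the
        # final local search to a small window around the prediction.
        n = len(self.keys)
        guess = min(max(int(self.a * key + self.b), 0), n - 1)
        lo, hi = max(guess - self.eps, 0), min(guess + self.eps, n - 1)
        ix = lo
        while ix <= hi and self.keys[ix] <= key:
            ix += 1
        return ix  # child slot such that keys[ix-1] <= key < keys[ix]

# Example: keys 10..50 are fit exactly by rank = 0.1*key - 1.
node = MNode(keys=[10, 20, 30, 40, 50], a=0.1, b=-1.0, eps=1,
             children=[None] * 6)
print(node.child_index(25))  # 2: 25 falls between keys[1]=20 and keys[2]=30
```

In the real structure the window search is an exponential search starting at the predicted slot, but the model-then-bounded-search shape is the same.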

2. Learned Query Formulation and Model Fitting

At each modelled node, Kanva's rank-prediction mechanism approximates the true rank-CDF F(k) of key k over the sorted dataset of size N by a fitted linear model

  F̃(k) = a·k + b

yielding a predicted rank

  r̂(k) = F̃(k) × N

such that, for all k in the segment,

  |r̂(k) − F(k)·N| ≤ ε

where ε is the maximal regression error at training time.

Model parameters a and b are computed in one pass via standard linear regression statistics, with x ranging over the keys and y(k) denoting the true rank of key k:

  a = (N·Σxy − Σx·Σy) / (N·Σx² − (Σx)²),   b = (Σy − a·Σx) / N

with

  ε = max_k |r̂(k) − y(k)|

In practice, this “fetch-and-add–free” scheme fits each node's model within a few hundred nanoseconds even on 64-thread platforms, and ε is reported within 5–40 keys of optimum for real datasets.
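The one-pass fit above can be written directly from the formulas. This is a plain sketch of the least-squares computation, not Kanva's implementation; x is the key and y its rank in the sorted array.

```python
def fit_linear_model(keys):
    """One-pass least-squares fit of rank = a*key + b, plus max error eps."""
    n = len(keys)
    sx = sum(keys)
    sy = sum(range(n))                              # ranks 0..n-1
    sxy = sum(k * r for r, k in enumerate(keys))
    sxx = sum(k * k for k in keys)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    # eps: the worst-case gap between predicted and true rank.
    eps = max(abs(a * k + b - r) for r, k in enumerate(keys))
    return a, b, eps

a, b, eps = fit_linear_model([10, 20, 30, 40, 50])
# A perfectly linear key set fits with (near-)zero error: a=0.1, b=-1.0.
print(a, b, eps)
```

Because every sum is a single accumulation over the sorted keys, the fit is one sequential pass, which is what makes sub-microsecond retraining per node plausible.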

3. Concurrency Architecture and Linearizability

Kanva ensures non-blocking progress, leveraging only single-word compare-and-swap (CAS) operations for all MNode and child updates. When a bin surpasses its capacity threshold BB, the first thread to observe this condition “freezes” it using atomic pointer tagging (bit-stealing), initiating a lock-free conversion process in which all threads encountering the frozen bin help promote it to an MNode. Reads (searches) never block and traverse the tree following model predictions without participating in conversion or helping.
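The "bit-stealing" freeze step can be illustrated with a single machine word whose low bit marks a frozen bin. The encoding below is an assumed one for illustration (Kanva's actual tag placement may differ), and a simulated single-word CAS stands in for the hardware instruction:

```python
FROZEN_BIT = 0x1  # low bit of an aligned pointer word, "stolen" as a tag

def cas(cell, expected, new):
    """Simulated single-word compare-and-swap on a one-element list."""
    if cell[0] == expected:
        cell[0] = new
        return True
    return False

def freeze(cell):
    """Atomically set the frozen tag; exactly one thread's CAS succeeds."""
    while True:
        word = cell[0]
        if word & FROZEN_BIT:
            return False               # another thread froze it first
        if cas(cell, word, word | FROZEN_BIT):
            return True                # this thread won the freeze

def is_frozen(word):
    return bool(word & FROZEN_BIT)

cell = [0x1000]                        # pretend 0x1000 is an aligned bin pointer
print(freeze(cell))                    # True: first freezer wins
print(freeze(cell))                    # False: later callers see the tag and help
print(is_frozen(cell[0]))              # True
```

Once the tag is set, no CAS that expected the untagged pointer can succeed, so helpers can safely collect the bin's contents and install the replacement MNode.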

Key invariants established by the design:

  • Unique Path: Once installed, models and node structures are immutable, preserving a fixed traversal route from root to bin for each search key.
  • No Data Loss: Each inserted key–value remains accessible via versioned-value lists, even through bin–MNode conversions.
  • Range-Scan Consistency: Range queries utilize versioned timestamps, guaranteeing snapshot isolation for all values committed before query start.

For each operation, there is a precise linearization point: inserts and deletes linearize at successful CAS steps, searches at the highest versioned-value read with timestamp ≤ invocation time, and ranges at atomic timestamp reads. Following Herlihy’s argument, lock-freedom is assured as threads either complete their operation or help another; thus, some operation always completes in finite steps.

4. Core Algorithms

Kanva’s primary algorithms are summarized below:

function Seek(key, node):
  ix ← searchInMNode(node, key)  // exponential-search + model
  if node.keys[ix] == key:
    return (node, ix, FOUND)
  child ← node.children[ix+1]
  if child == null:
    return (node, ix, NFOUND)
  if child is Bin:
    return (node, ix, MAYBE)
  else:
    return Seek(key, child)

procedure Insert(key, val):
  retry:
    (node, ix, st) ← Seek(key, root)
    if st == FOUND:
      return writeValue(node, ix, val)
    elseif st == NFOUND:
      newB ← Bin(key, val)
      if CAS(node.children[ix+1], null, newB): return true
      else goto retry
    else:  // MAYBE
      b ← node.children[ix+1]
      if b.size ≥ B or b.frozen: helpConvert(node, ix, b); goto retry
      res ← b.insertBin(key, val)
      if res == HELP: helpConvert(node, ix, b); goto retry
      return res

procedure Delete(key):
  retry:
    (node, ix, st) ← Seek(key, root)
    if st == FOUND:
      return writeValue(node, ix, null)
    elseif st == NFOUND:
      return false
    else:
      b ← node.children[ix+1]
      res ← b.deleteBin(key)
      if res == HELP: helpConvert(node, ix, b); goto retry
      return res

function Search(key):
  (node, ix, st) ← Seek(key, root)
  if st == FOUND:
    return readValue(node.versions[ix])
  elseif st == NFOUND:
    return null
  else:
    b ← node.children[ix+1]
    entry ← searchBin(b, key)
    return (entry.key == key) ? readValue(entry.version) : null

procedure helpConvert(node, ix, bin):
  (K[], V[]) ← freezeAndCollect(bin)
  model ← fitLinearModel(K)   // yields (a, b, ε)
  newM ← MNode(K, V, model)
  CAS(node.children[ix+1], bin, newM)

Versioned-value arrays maintain singly-linked stacks of (value, timestamp) nodes, with a global atomic timestamp counter providing snapshot isolation for range queries.
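A minimal sketch of such a versioned-value stack and its snapshot read follows. It is deliberately single-threaded: in Kanva, the push is a CAS on the list head and the counter is an atomic fetch-and-increment.

```python
import itertools

_clock = itertools.count(1)  # stand-in for the global atomic timestamp counter

class VersionedValue:
    """Singly-linked stack of (value, timestamp); newest version at the head."""
    def __init__(self):
        self.head = None  # tuple (value, ts, next) or None

    def write(self, value):
        ts = next(_clock)                 # Kanva: atomic fetch-and-increment
        self.head = (value, ts, self.head)
        return ts

    def read_at(self, snapshot_ts):
        # Newest version with timestamp <= snapshot: the value that a range
        # scan which started at `snapshot_ts` must observe.
        node = self.head
        while node is not None:
            value, ts, nxt = node
            if ts <= snapshot_ts:
                return value
            node = nxt
        return None                       # key did not exist at snapshot time

slot = VersionedValue()
t1 = slot.write("v1")
t2 = slot.write("v2")
print(slot.read_at(t1))   # v1: the write at t2 is invisible to a t1 snapshot
print(slot.read_at(t2))   # v2
```

Because old versions stay reachable behind the head, a search that linearizes at timestamp t always finds the value committed at or before t, which is exactly the range-scan consistency invariant stated above.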

5. Comparative Performance Evaluation

Kanva was evaluated against leading lock-free search data structures, specifically C-IST (lock-free interpolation search tree) and E-ABT (elimination (a,b)-tree), as well as ALEX, LFABT, and FineDex, on workloads up to 200 million keys across 128 threads of a dual-socket 32-core AMD EPYC server.

Throughput (Mops/sec) under a read-heavy workload (95% search):

Threads    8      16     32     64     128
Kanva      28.4   52.1   75.2   97.8   85.0
E-ABT       3.1    5.8    9.7   17.4   15.2
C-IST       1.4    2.6    4.1    6.2    5.8

  • Under read-heavy workloads, Kanva delivers up to 100 million operations per second (Mops/sec) on 64 threads, approximately 10× greater throughput than E-ABT and 20× greater than C-IST.
  • For update-heavy workloads (30% search, 50% insert, 20% delete), Kanva achieves 60–80 Mops/sec, about 4× over E-ABT and 7× over C-IST.
  • These gains persist under skewed Zipfian distributions and on the real-world “Facebook,” “Amazon,” and “OSM” datasets.
  • Kanva incurs 20–30% fewer LLC misses than C-IST or LFABT, benefiting from its fatter internal nodes and zero need for tree rebalancing.
  • YCSB workloads further show Kanva’s throughput exceeds E-ABT by 1.2–1.4× and C-IST by 2–3× for various read/write mixes, maintaining performance even under high “hot-spot” contention.

6. Implications and Context

Kanva’s integration of learned arithmetic search with lock-free, non-blocking updates offers both the average-case speed advantages of learned indexes and the worst-case progress guarantees of concurrent non-blocking search structures. By avoiding global locks or compare-based traversals, Kanva achieves a combination of high multi-core scalability, provable linearizability, and cache-friendly access that surpasses prior state-of-the-art approaches by up to one to two orders of magnitude on practical datasets (Bhardwaj et al., 2023). This suggests the feasibility of deploying learned methods in concurrent primitives without sacrificing correctness or performance guarantees. The approach establishes a general paradigm for non-blocking, model-based data structures capable of efficient concurrent access patterns.
