SLS-INDEX: Secure Learned Spatial Index

Updated 10 December 2025

The paper introduces SLS-INDEX, a secure indexing framework combining hierarchical learned predictors with Paillier homomorphic encryption to support privacy-preserving spatial range queries.
It employs a robust two-server semi-honest model with secure bucketing, permutation-based bucket prediction, and noise padding to protect data and query access patterns.
Empirical evaluations show sub-second query latency and 2–3× faster range queries on real datasets compared to traditional ciphertext-only approaches.

The Secure Learned Spatial Index (SLS-INDEX) is a privacy-preserving indexing framework for range queries over encrypted spatial datasets. Designed to address the dual challenges of access-pattern privacy and query efficiency, SLS-INDEX integrates a hierarchical learned predictor architecture with Paillier homomorphic encryption and includes protocols for secure bucket prediction and point extraction. The construction achieves strong formal privacy guarantees under a robust two-server semi-honest adversarial model, while supporting practical sub-second query latency at scale (Wang et al., 3 Dec 2025).

1. Threat Model and Security Framework

SLS-INDEX is analyzed in the standard two-server semi-honest model. The Data Service Provider (DSP) and Data Assistance Provider (DAP) are assumed to be non-colluding and semi-honest: they follow the protocol but may attempt to glean additional information from observed data and protocol transcripts. The Data Owner (DO) and Client are fully trusted, and all inter-party channels are authenticated.

Adversarial Views:

DSP: Observes the encrypted index $I^e$, encrypted queries $\llbracket Q\rrbracket$, transcripts of all homomorphic computations, outputs of secure shuffles (permutations), and garbled indicator vectors.
DAP: Possesses the secret Paillier key $sk$, receives randomized ciphertexts from DSP, and accesses decrypted intermediate values (under controlled opening semantics) and noise-perturbed buckets or points.
Neither DSP nor DAP learns underlying plaintext data, queries, result values, or true access patterns beyond what is captured by the explicitly defined leakage functions.

Leakage Functions:

Build: $\mathcal L_{\mathsf{Build}}(\llbracket P\rrbracket) = (n, d, \Psi)$ with $n$ (point count), $d$ (dimensionality), and $\Psi$ (the sorted order of Z-curve ranks).
Update: $\mathcal L_{\mathsf{Update}}(I^e, \mathsf{op}) = (\mathit{pos}, \mathit{bid}, \varpi)$ for operation position, bucket identifier, and operation type ($\mathrm{insert}$ or $\mathrm{delete}$).
Query: $\mathcal L_{\mathsf{Query}}(I^e, \llbracket Q \rrbracket) = (\beta_{\mathrm{low}}, \beta_{\mathrm{upp}}, |R^e|)$ for the encrypted bucket scan-range and result ciphertext volume.

Security is defined by indistinguishability between real and ideal experiments: for any PPT adversary there exists a simulator, given only the outputs of the leakage functions, such that the adversary's advantage in distinguishing the real execution from the ideal one is negligible in the security parameter $\lambda$.
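
In symbols, the real/ideal condition is commonly written as follows. This is a sketch in standard simulation-based notation: the names $\mathsf{Real}$, $\mathsf{Ideal}$, $\mathcal A$, $\mathcal S$, and $\mathsf{negl}$ are assumptions and may differ from the paper's own notation. The simulator $\mathcal S$ receives only the outputs of $\mathcal L_{\mathsf{Build}}$, $\mathcal L_{\mathsf{Update}}$, and $\mathcal L_{\mathsf{Query}}$.

$\left| \Pr\left[\mathsf{Real}_{\mathcal A}(\lambda) = 1\right] - \Pr\left[\mathsf{Ideal}_{\mathcal A, \mathcal S}(\lambda) = 1\right] \right| \le \mathsf{negl}(\lambda)$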

2. Index Construction and System Components

SLS-INDEX organizes encrypted spatial data using a hierarchical learned index with a three-stage prediction pipeline. The structure enables efficient pruning and bucket localization in the encrypted domain.

2.1 Z-curve Mapping and Bucketing

Spatial Encoding: Each $d$-dimensional point $\bm p$ is mapped to a 1D Z-curve value $\bm p.\mathrm{cur}$.
Ordering: Points are sorted by $\mathrm{cur}$ and assigned order indices $\bm p.\mathrm{ord} \in \{1, \dots, n\}$.
Bucketing: With fixed capacity $b$ , each point's bucket ID is determined as

$\bm p.\mathrm{bkt} = \left\lceil \frac{\bm p.\mathrm{ord}}{b} \right\rceil$

Storage: Each bucket contains encrypted points $E(\bm p)$ and an encrypted minimum bounding rectangle (MBR) $E(\mathit{mbr})$ .
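
The mapping, ordering, and bucketing steps above can be sketched on plaintext data as follows. This is a minimal illustration, not the paper's implementation: it assumes coordinates are already quantized to non-negative integers, the bit-interleaving order is one common convention, and the names `z_value` and `assign_buckets` are hypothetical. Encryption of points and MBRs is omitted.

```python
import math

def z_value(point, bits=16):
    """Interleave the low `bits` bits of each coordinate (Morton / Z-order code)."""
    d = len(point)
    code = 0
    for bit in range(bits):
        for dim, coord in enumerate(point):
            # Bit `bit` of dimension `dim` lands at position bit * d + dim.
            code |= ((coord >> bit) & 1) << (bit * d + dim)
    return code

def assign_buckets(points, b):
    """Sort by Z-value, give 1-based order indices, and set bkt = ceil(ord / b)."""
    ranked = sorted(points, key=z_value)
    return {p: math.ceil(order / b) for order, p in enumerate(ranked, start=1)}

points = [(3, 1), (0, 0), (1, 1), (2, 2), (1, 0)]
buckets = assign_buckets(points, b=2)
```

With capacity `b=2`, the five points fall into three buckets, the last of which holds a single real point and would be padded as described in Section 2.3.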

2.2 Hierarchical Predictors

Head Level (Level 0): The space is partitioned into $2^{\lfloor \log_4(m/b) \rfloor} \times 2^{\lfloor \log_4(m/b) \rfloor}$ grid cells. An MLP ($\mathcal M_{0,0}$) is trained to predict the cell ID, and its parameters are protected in the SMLP$_p$ form (biases encrypted; weights kept in plaintext with additive noise).
Intermediate Levels: Any grid cell holding more than the threshold $m$ of points is partitioned recursively, with a new SMLP$_p$ model trained per refined partition.
Leaf Level: For partitions with $\le m$ points, $\mathcal M_{h,k}$ predicts the final bucket ID. Here, all model parameters are fully encrypted (SMLP$_c$). Secure ReLU ($\mathrm{SReLU}$) is evaluated via secure integer comparison (SIC).
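
To see why plaintext weights with encrypted biases fit an additively homomorphic scheme, consider a single neuron with integer weights $w_i$, bias $b_0$, and encrypted inputs $\llbracket x_i \rrbracket$. Under Paillier (Section 2.4), raising a ciphertext to a plaintext power multiplies the underlying message, so the pre-activation can be computed on ciphertexts alone. The formula below is a sketch using the standard Paillier identities; it assumes weights are scaled to integers (negative values represented modulo the Paillier modulus $N$), and it does not capture how the paper handles the additive noise on the weights or the fully encrypted SMLP$_c$ case, which requires interaction.

$\llbracket z \rrbracket = \llbracket b_0 \rrbracket \cdot \prod_{i} \llbracket x_i \rrbracket^{w_i} \bmod N^2, \qquad z = b_0 + \sum_{i} w_i x_i \bmod N$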

2.3 Bucket Padding

Each bucket is padded with dummy points $\llbracket 0\rrbracket$ up to the fixed capacity $b$, which hides the true data distribution and bucket sizes.
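
A minimal sketch of the padding step, assuming a bucket is a Python list of ciphertexts. The name `pad_bucket` and the `DUMMY` placeholder are hypothetical; in the actual scheme each dummy would be a fresh encryption of 0 so that it is indistinguishable from a real point.

```python
DUMMY = None  # placeholder for a fresh encryption of 0, written [[0]] in the text

def pad_bucket(bucket, b):
    """Return a copy of `bucket` extended with dummies to exactly b entries."""
    if len(bucket) > b:
        raise ValueError("bucket exceeds capacity b")
    return list(bucket) + [DUMMY] * (b - len(bucket))

padded = pad_bucket(["ct1"], b=3)
```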

2.4 Paillier Homomorphic Encryption

SLS-INDEX uses the Paillier cryptosystem for all index and data encryption; its additive homomorphism enables secure arithmetic over ciphertexts:

Encryption: $E(m;r) = g^m r^N \bmod N^2$ for message $m$ and randomness $r$, where $N$ is the Paillier modulus (distinct from the point count $n$).
Homomorphic Addition: $E(m_1; r_1) \cdot E(m_2; r_2) \bmod N^2 = E(m_1 + m_2 \bmod N; r_1 r_2)$.
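
The two identities can be checked with a toy Paillier implementation. This is an illustrative sketch only: it uses tiny hard-coded primes and the common choice $g = N + 1$, both of which are assumptions made here for readability, and it is not secure. The function names are hypothetical.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy key generation with tiny primes (insecure, illustration only)."""
    N = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = N + 1
    mu = pow(lam, -1, N)  # correct decryption constant when g = N + 1
    return (N, g), (lam, mu)

def encrypt(pk, m, r=None):
    """E(m; r) = g^m * r^N mod N^2, with r coprime to N."""
    N, g = pk
    while r is None or math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(g, m, N * N) * pow(r, N, N * N)) % (N * N)

def decrypt(pk, sk, c):
    """m = L(c^lam mod N^2) * mu mod N, where L(u) = (u - 1) / N."""
    N, _ = pk
    lam, mu = sk
    u = pow(c, lam, N * N)
    return ((u - 1) // N * mu) % N

pk, sk = keygen()
N2 = pk[0] ** 2
c1, c2 = encrypt(pk, 20), encrypt(pk, 22)
sum_plain = decrypt(pk, sk, (c1 * c2) % N2)    # ciphertext product -> plaintext sum
scaled_plain = decrypt(pk, sk, pow(c1, 3, N2)) # ciphertext power -> plaintext scaling
```

Multiplying ciphertexts adds the plaintexts, and raising a ciphertext to a plaintext constant scales the plaintext; these are the only ciphertext-side operations an additively homomorphic index can rely on without interaction.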

3. Secure Range Query Protocols

SLS-INDEX employs two key protocols to handle range queries without leaking access patterns or sensitive structural information.

3.1 Permutation-Based Secure Bucket Prediction (SBP)

SBP allows secure index traversal from the head to leaf predictor, hiding the real search path and bucket identifiers using a combination of model encryption, shuffling, and PRF masking:

The Client sends $\llbracket Q\rrbracket$ to DSP.
DSP traverses the SMLP$_p$ predictors, switching to a leaf-level SMLP$_c$ when appropriate.
DSP introduces a dummy bucket and randomization, then shuffles candidate bucket ciphertexts with a random permutation $\pi$ and sends them to DAP.
DAP decrypts, repermutes, and returns results to DSP. DSP then accesses the candidate buckets, perturbs their encrypted contents, and sends them together with the query to DAP.
DAP checks membership on the decrypted values, sets a flag, and sends the result back to DSP, which finalizes the relevant bucket ID.

This protocol restricts leakage to information allowed by the leakage function, ensuring that bucket selection and search path remain hidden.
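
The permutation bookkeeping in the shuffle step can be sketched as follows. This covers only how a server might append a dummy, apply a random permutation $\pi$, and later invert it; encryption, randomization, and PRF masking are omitted, and the function names are hypothetical rather than taken from the paper.

```python
import random

def shuffle_with_dummy(candidates, dummy, rng):
    """Append a dummy entry and apply a random permutation; return both."""
    items = list(candidates) + [dummy]
    perm = list(range(len(items)))
    rng.shuffle(perm)
    shuffled = [items[perm[i]] for i in range(len(items))]  # shuffled[i] = items[perm[i]]
    return shuffled, perm

def unshuffle(shuffled, perm):
    """Invert the permutation: entry i of `shuffled` goes back to slot perm[i]."""
    original = [None] * len(shuffled)
    for new_pos, old_pos in enumerate(perm):
        original[old_pos] = shuffled[new_pos]
    return original

rng = random.Random(7)
shuffled, perm = shuffle_with_dummy(["bkt3", "bkt4"], "dummy", rng)
restored = unshuffle(shuffled, perm)
```

Only the party that drew `perm` can map a position in the shuffled list back to a real bucket, which is what keeps the other server from learning which candidate was selected.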

3.2 Secure Point Extraction (SPE)

SPE retrieves all encrypted points in a bucket range covering a query's scan region:

DSP obtains the bucket IDs spanning the query range and masks them before sending them to DAP.
DAP decrypts and unmasks, returning the actual bucket range to DSP.
For each relevant bucket, DSP perturbs, shuffles, and sends obfuscated points and MBRs to DAP, who tests intersection and sets indicator vectors.
DSP inverts the shuffle, computes a denoising factor, removes perturbations, and reconstructs the set of result ciphertexts.
This process avoids reliance on heavy MPC, using authorized openings and homomorphic operations for practical performance.
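
A plaintext analogue of the intersection test and indicator vectors is sketched below. It is an assumption-laden illustration: the perturbation, encryption, and denoising steps are left out entirely, so here the second party sees values it would never see in the real protocol. Rectangles are given as `(lows, highs)` tuples, and the names `intersects`, `dap_indicators`, and `dsp_select` are hypothetical.

```python
import random

def intersects(mbr, query):
    """Axis-aligned rectangles overlap iff they overlap in every dimension."""
    (m_lo, m_hi), (q_lo, q_hi) = mbr, query
    return all(ml <= qh and ql <= mh
               for ml, mh, ql, qh in zip(m_lo, m_hi, q_lo, q_hi))

def dap_indicators(shuffled_mbrs, query):
    """Second server: one 0/1 flag per shuffled MBR."""
    return [1 if intersects(m, query) else 0 for m in shuffled_mbrs]

def dsp_select(bucket_ids, mbrs, query, rng):
    """First server: shuffle, collect flags, then map flags back through the shuffle."""
    perm = list(range(len(mbrs)))
    rng.shuffle(perm)
    flags = dap_indicators([mbrs[j] for j in perm], query)
    return sorted(bucket_ids[perm[i]] for i, flag in enumerate(flags) if flag)

mbrs = [((0, 0), (2, 2)), ((5, 5), (7, 7)), ((1, 1), (3, 3))]
hits = dsp_select([4, 5, 6], mbrs, query=((2, 2), (4, 4)), rng=random.Random(1))
```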

4. Formal Security and Privacy Guarantees

SLS-INDEX's security is formally proven under the semantic security of Paillier encryption, PRF pseudorandomness, and oblivious shuffling:

A simulator using only values revealed by the leakage functions can produce simulated protocol transcripts indistinguishable from real executions.
Each stage, from index construction to update and query, preserves computational indistinguishability when replacing real values with random ones encrypted under fresh randomness.
The main result is that SLS-INDEX is $(\mathcal L_{\mathsf{Build}}, \mathcal L_{\mathsf{Update}}, \mathcal L_{\mathsf{Query}})$-secure in the two-server semi-honest setting, with negligible advantage for PPT adversaries.

5. Empirical Evaluation and Efficiency

Experiments for SLS-INDEX were conducted on synthetic (UNI, NOR, SKE) and real (CAR, GOW) datasets with up to $n=10^5$ two-dimensional points. Baseline comparisons included TRQED$^+$ and SRQ$_b$.

Metric	SLS-INDEX	TRQED $^+$	SRQ $_b$
Construction Time	125 s	90 s	40 s
Storage (for $n = 10^5$ )	120 MB	(not specified)	(not specified)
Query Latency (UNI)	< 1 s	3–5 s	2–4 s
Query Latency (CAR/GOW)	2–3× faster than baselines	—	—
Recall	$\approx$ 100%	—	—

Construction time is linear in $n$ .
Storage overhead is $O(n \log n)$ ciphertexts.
Query Latency: For a query area of 0.005%, queries complete in under 1 second for $n=10^5$, compared with 3–5 s (TRQED$^+$) and 2–4 s (SRQ$_b$). On large real datasets (CAR, GOW), SLS-INDEX answers queries 2–3× faster than the baselines.
Communication: $O(d \log n \cdot |N|)$ bits per query, accounting for 40–60% of query latency.
Updates: Each insert or delete takes $O(d \log n)$ time to rerandomize a leaf predictor.
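
For a rough sense of scale, the leading term of the communication bound can be evaluated numerically. This is a back-of-the-envelope sketch under stated assumptions: $|N|$ is taken to be the bit length of the Paillier modulus and set to 2048 bits, the logarithm is taken base 2, and the constant factor hidden by the $O$-notation is ignored, so the result is an order-of-magnitude estimate rather than a measured value.

```python
import math

def query_comm_bits(d, n, modulus_bits):
    """Leading-order term d * log2(n) * |N|; hidden constants are ignored."""
    return d * math.log2(n) * modulus_bits

bits = query_comm_bits(d=2, n=10**5, modulus_bits=2048)  # assumed 2048-bit modulus
kib = bits / 8 / 1024
print(f"{bits:.0f} bits, about {kib:.1f} KiB per query (before constant factors)")
```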

These results suggest that deploying SLS-INDEX in cloud settings can make privacy-preserving spatial range queries two to three times faster than previous ciphertext-only approaches while retaining its formal privacy properties.

6. Context, Limitations, and Significance

SLS-INDEX demonstrates that learned index structures, when combined with homomorphic encryption and secure protocol design, can offer both strong privacy guarantees and query performance suitable for interactive encrypted data analytics. By exposing only rigorously bounded leakage, SLS-INDEX prevents inference of dataset content, queries, individual results, or access patterns except for clearly defined volume and structure information.

The index accommodates efficient dynamic updates and scales with data dimensionality and dataset size.
Limitations include the cryptographic performance overhead intrinsic to public-key homomorphic encryption and the requirement of a two-server non-colluding model.
The introduction of bucket noise and secure point extraction protocols reduces information revealed via side channels.

The design principles underlying SLS-INDEX are likely to influence future research in encrypted database indexing and privacy-preserving spatial query processing (Wang et al., 3 Dec 2025).

References

Towards Privacy-Preserving Range Queries with Secure Learned Spatial Index over Encrypted Data (2025)
