
Optimal Las Vegas Locality Sensitive Data Structures (1704.02054v3)

Published 6 Apr 2017 in cs.DS

Abstract: We show that approximate similarity (near neighbour) search can be solved in high dimensions with performance matching state of the art (data independent) Locality Sensitive Hashing, but with a guarantee of no false negatives. Specifically, we give two data structures for common problems. For $c$-approximate near neighbour in Hamming space we get query time $dn^{1/c+o(1)}$ and space $dn^{1+1/c+o(1)}$, matching that of \cite{indyk1998approximate} and answering a long standing open question from~\cite{indyk2000dimensionality} and~\cite{pagh2016locality} in the affirmative. By means of a new deterministic reduction from $\ell_1$ to Hamming we also solve $\ell_1$ and $\ell_2$ with query time $d^2n^{1/c+o(1)}$ and space $d^2 n^{1+1/c+o(1)}$. For $(s_1,s_2)$-approximate Jaccard similarity we get query time $dn^{\rho+o(1)}$ and space $dn^{1+\rho+o(1)}$, $\rho=\log\frac{1+s_1}{2s_1}\big/\log\frac{1+s_2}{2s_2}$, when sets have equal size, matching the performance of~\cite{tobias2016}. The algorithms are based on space partitions, as with classic LSH, but we construct these using a combination of brute force, tensoring, perfect hashing and splitter functions à la~\cite{naor1995splitters}. We also show a new dimensionality reduction lemma with 1-sided error.
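To make the exponents in the abstract concrete, the following is a minimal sketch that evaluates $\rho$ for the Hamming and Jaccard cases and the resulting leading-order query and space costs. It is not the paper's data structure: the function names (`hamming_rho`, `jaccard_rho`, `leading_costs`) are hypothetical, the $o(1)$ terms in the exponents are simply dropped, and it assumes $c > 1$ and $0 < s_2 < s_1 \le 1$.

```python
import math


def hamming_rho(c: float) -> float:
    """Exponent rho = 1/c for c-approximate near neighbour in Hamming space."""
    if c <= 1:
        raise ValueError("assumes approximation factor c > 1")
    return 1.0 / c


def jaccard_rho(s1: float, s2: float) -> float:
    """Exponent rho = log((1+s1)/(2*s1)) / log((1+s2)/(2*s2)) from the abstract.

    Assumes 0 < s2 < s1 <= 1 (s1 is the 'near' similarity, s2 the 'far' one).
    """
    if not (0 < s2 < s1 <= 1):
        raise ValueError("assumes 0 < s2 < s1 <= 1")
    return math.log((1 + s1) / (2 * s1)) / math.log((1 + s2) / (2 * s2))


def leading_costs(n: int, d: int, rho: float, dim_power: int = 1):
    """Leading-order (query, space) = (d^p * n^rho, d^p * n^(1+rho)).

    The o(1) terms in the exponents are ignored, so these are illustrative
    magnitudes only. Use dim_power=2 for the l1/l2 bounds quoted above.
    """
    dim_factor = d ** dim_power
    return dim_factor * n ** rho, dim_factor * n ** (1 + rho)


if __name__ == "__main__":
    n, d = 10**6, 128
    for label, rho in [
        ("Hamming, c=2", hamming_rho(2.0)),
        ("Jaccard, s1=0.5, s2=0.25", jaccard_rho(0.5, 0.25)),
    ]:
        query, space = leading_costs(n, d, rho)
        print(f"{label}: rho={rho:.4f}, query~{query:.3g}, space~{space:.3g}")
```

Since $(1+s)/(2s)$ decreases in $s$, any $s_1 > s_2$ gives $\rho < 1$, so the query cost is sublinear in $n$ at the price of superlinear space.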

Authors (1)
  1. Thomas Dybdahl Ahle (5 papers)
Citations (21)
