
Fast entropy-bounded string dictionary look-up with mismatches (1806.09646v1)

Published 25 Jun 2018 in cs.DS

Abstract: We revisit the fundamental problem of dictionary look-up with mismatches. Given a set (dictionary) of $d$ strings of length $m$ and an integer $k$, we must preprocess it into a data structure to answer the following queries: Given a query string $Q$ of length $m$, find all strings in the dictionary that are at Hamming distance at most $k$ from $Q$. Chan and Lewenstein (CPM 2015) showed a data structure for $k = 1$ with optimal query time $O(m/w + occ)$, where $w$ is the size of a machine word and $occ$ is the size of the output. The data structure occupies $O(w d \log^{1+\varepsilon} d)$ extra bits of space (beyond the entropy-bounded space required to store the dictionary strings). In this work we give a solution with similar bounds for a much wider range of values $k$. Namely, we give a data structure that has $O(m/w + \log^k d + occ)$ query time and uses $O(w d \log^k d)$ extra bits of space.
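
To make the query in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the data structure from the paper: it does no preprocessing and simply scans every dictionary string, so a query costs roughly $O(dm)$ time. The function names `hamming_distance` and `lookup_with_mismatches` are hypothetical and chosen only for illustration.

```python
from typing import List


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def lookup_with_mismatches(dictionary: List[str], query: str, k: int) -> List[str]:
    """Return every dictionary string within Hamming distance k of the query.

    Naive baseline (an assumption for illustration, not the paper's method):
    compare the query against each of the d strings in turn.
    """
    return [s for s in dictionary if hamming_distance(s, query) <= k]


# Example with d = 4 strings of length m = 4.
words = ["abcd", "abce", "xbcd", "wxyz"]
matches_k1 = lookup_with_mismatches(words, "abcd", 1)
matches_k0 = lookup_with_mismatches(words, "abcd", 0)
```

With `k = 1` the query `"abcd"` matches the three strings that differ from it in at most one position; with `k = 0` the query degenerates to exact look-up. The paper's contribution is answering the same query without the factor of $d$ in the running time.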

Authors (3)
  1. Paweł Gawrychowski (151 papers)
  2. Gad M. Landau (16 papers)
  3. Tatiana Starikovskaya (35 papers)
Citations (2)
