
Sublinear-Time Algorithms for Computing & Embedding Gap Edit Distance (2007.12762v2)

Published 24 Jul 2020 in cs.DS

Abstract: In this paper, we design new sublinear-time algorithms for solving the gap edit distance problem and for embedding edit distance to Hamming distance. For the gap edit distance problem, we give an $\tilde{O}(\frac{n}{k}+k^2)$-time greedy algorithm that distinguishes between length-$n$ input strings with edit distance at most $k$ and those with edit distance exceeding $(3k+5)k$. This is an improvement and a simplification upon the result of Goldenberg, Krauthgamer, and Saha [FOCS 2019], where the $k$ vs $\Theta(k^2)$ gap edit distance problem is solved in $\tilde{O}(\frac{n}{k}+k^3)$ time. We further generalize our result to solve the $k$ vs $k'$ gap edit distance problem in time $\tilde{O}(\frac{nk}{k'}+k^2+\frac{k^2}{k'}\sqrt{nk})$, strictly improving upon the previously known bound $\tilde{O}(\frac{nk}{k'}+k^3)$. Finally, we show that if the input strings do not have long highly periodic substrings, then already the $k$ vs $(1+\epsilon)k$ gap edit distance problem can be solved in sublinear time. Specifically, if the strings contain no substring of length $\ell$ with period at most $2k$, then the running time we achieve is $\tilde{O}(\frac{n}{\epsilon^2 k}+k^2\ell)$. We further give the first sublinear-time probabilistic embedding of edit distance to Hamming distance. For any parameter $p$, our $\tilde{O}(\frac{n}{p})$-time procedure yields an embedding with distortion $O(kp)$, where $k$ is the edit distance of the original strings. Specifically, the Hamming distance of the resultant strings is between $\frac{k-p+1}{p+1}$ and $O(k^2)$ with good probability. This generalizes the linear-time embedding of Chakraborty, Goldenberg, and Koucký [STOC 2016], where the resultant Hamming distance is between $\frac{k}{2}$ and $O(k^2)$. Our algorithm is based on a random walk over samples, which we believe will find other applications in sublinear-time algorithms.
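
For orientation, the linear-time embedding of Chakraborty, Goldenberg, and Koucký that the abstract generalizes can be pictured as a random walk over the input string. The sketch below is a minimal Python illustration of that earlier linear-time idea, not the sublinear-time procedure of this paper. The names `cgk_embed`, `make_shared_bits`, and `hamming`, the padding symbol, and the choice of output length $3n$ are assumptions made for this example.

```python
import random


def make_shared_bits(seed):
    """Return a function r(j, c) giving a random bit per (step, character).

    The bits are drawn lazily and cached, so every string embedded with the
    same function sees the same randomness (hypothetical helper).
    """
    rng = random.Random(seed)
    cache = {}

    def bits(j, c):
        key = (j, c)
        if key not in cache:
            cache[key] = rng.getrandbits(1)
        return cache[key]

    return bits


def cgk_embed(x, bits, out_len, pad="\0"):
    """Random-walk embedding sketch.

    At step j, emit the character under the pointer (or `pad` once the
    pointer has passed the end of x), then advance the pointer by the
    random bit bits(j, character).
    """
    out = []
    i = 0
    for j in range(out_len):
        if i < len(x):
            c = x[i]
            out.append(c)
            i += bits(j, c)
        else:
            out.append(pad)
    return "".join(out)


def hamming(u, v):
    """Hamming distance of two equal-length strings."""
    if len(u) != len(v):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(u, v))


if __name__ == "__main__":
    x = "abracadabra"
    y = "abracadabrx"  # one substitution away from x
    n = max(len(x), len(y))
    bits = make_shared_bits(seed=1)
    ex = cgk_embed(x, bits, 3 * n)
    ey = cgk_embed(y, bits, 3 * n)
    print(hamming(ex, ey))
```

Both strings must be embedded with the same `bits` function, since the guarantee quoted in the abstract (Hamming distance between $\frac{k}{2}$ and $O(k^2)$ with good probability) relies on shared randomness. With shared randomness, identical inputs always map to identical outputs.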
