Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
139 tokens/sec
GPT-4o
8 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Optimal Algorithms for Bounded Weighted Edit Distance (2305.06659v2)

Published 11 May 2023 in cs.DS

Abstract: The edit distance of two strings is the minimum number of insertions, deletions, and substitutions of characters needed to transform one string into the other. The textbook dynamic-programming algorithm computes the edit distance of two length-$n$ strings in $O(n2)$ time, which is optimal up to subpolynomial factors under SETH. An established way of circumventing this hardness is to consider the bounded setting, where the running time is parameterized by the edit distance $k$. A celebrated algorithm by Landau and Vishkin (JCSS '88) achieves time $O(n + k2)$, which is optimal as a function of $n$ and $k$. Most practical applications rely on a more general weighted edit distance, where each edit has a weight depending on its type and the involved characters from the alphabet $\Sigma$. This is formalized through a weight function $w : \Sigma\cup{\varepsilon}\times\Sigma\cup{\varepsilon}\to\mathbb{R}$ normalized so that $w(a,a)=0$ and $w(a,b)\geq 1$ for all $a,b \in \Sigma\cup{\varepsilon}$ with $a \neq b$; the goal is to find an alignment of the two strings minimizing the total weight of edits. The $O(n2)$-time algorithm supports this setting seamlessly, but only very recently, Das, Gilbert, Hajiaghayi, Kociumaka, and Saha (STOC '23) gave the first non-trivial algorithm for the bounded version, achieving time $O(n + k5)$. While this running time is linear for $k\le n{1/5}$, it is still very far from the bound $O(n+k2)$ achievable in the unweighted setting. In this paper, we essentially close this gap by showing both an improved $\tilde O(n+\sqrt{nk3})$-time algorithm and, more surprisingly, a matching lower bound: Conditioned on the All-Pairs Shortest Paths (APSP) hypothesis, our running time is optimal for $\sqrt{n}\le k\le n$ (up to subpolynomial factors). This is the first separation between the complexity of the weighted and unweighted edit distance problems.

Citations (2)

Summary

We haven't generated a summary for this paper yet.

Youtube Logo Streamline Icon: https://streamlinehq.com