Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
167 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

A Data-Structure for Approximate Longest Common Subsequence of A Set of Strings (2008.01768v3)

Published 4 Aug 2020 in cs.DS

Abstract: Given a set of $k$ strings $I$, their longest common subsequence (LCS) is the string with the maximum length that is a subset of all the strings in $I$. A data-structure for this problem preprocesses $I$ into a data-structure such that the LCS of a set of query strings $Q$ with the strings of $I$ can be computed faster. Since the problem is NP-hard for arbitrary $k$, we allow an error that allows some characters to be replaced by other characters. We define the approximation version of the problem with an extra input $m$, which is the length of the regular expression (regex) that describes the input, and the approximation factor is the logarithm of the number of possibilities in the regex returned by the algorithm, divided by the logarithm regex with the minimum number of possibilities. Then, we use a tree data-structure to achieve sublinear-time LCS queries. We also explain how the idea can be extended to the longest increasing subsequence (LIS) problem.

Summary

We haven't generated a summary for this paper yet.