Improved Extended Regular Expression Matching (2510.09311v1)
Abstract: An extended regular expression $R$ specifies a set of strings formed by characters from an alphabet combined with concatenation, union, intersection, complement, and star operators. Given an extended regular expression $R$ and a string $Q$, the extended regular expression matching problem is to decide if $Q$ matches any of the strings specified by $R$. Extended regular expressions are a basic concept in formal language theory and a basic primitive for searching and processing data. Extended regular expression matching was introduced by Hopcroft and Ullmann in the 1970s [\textit{Introduction to Automata Theory, Languages and Computation}, 1979], who gave a simple dynamic programming solution using $O(n3m)$ time and $O(n2m)$ space, where $n$ is the length of $Q$ and $m$ is the length of $R$. Since then, several solutions have been proposed, but few significant asymptotic improvements have been obtained. The current state-of-the art solution, by Yamamoto and Miyazaki~[COCOON, 2003], uses $O(\frac{n3k + n2m}{w} + n + m)$ time and $O(\frac{n2k + nm}{w} + n + m)$ space, where $k$ is the number of negation and complement operators in $R$ and $w$ is the number of bits in a word. This roughly replaces the $m$ factor with $k$ in the dominant terms of both the space and time bounds of the Hopcroft and Ullmann algorithm. We revisit the problem and present a new solution that significantly improves the previous time and space bounds. Our main result is a new algorithm that solves extended regular expression matching in [O\left(n\omega k + \frac{n2m}{\min(w/\log w, \log n)} + m\right)] time and $O(\frac{n2 \log k}{w} + n + m) = O(n2 +m)$ space, where $\omega \approx 2.3716$ is the exponent of matrix multiplication. Essentially, this replaces the dominant $n3k$ term with $n\omega k$ in the time bound, while simultaneously improving the $n2k$ term in the space to $O(n2)$.
Sponsor
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.