Periodicity in Data Streams with Wildcards (1802.07375v3)
Abstract: We investigate the problem of detecting periodic trends within a string $S$ of length $n$, arriving in the streaming model, containing at most $k$ wildcard characters, where $k=o(n)$. A wildcard character is a special character that can be assigned any other character. We say $S$ has wildcard-period $p$ if there exists an assignment to each of the wildcard characters so that in the resulting stream the length $n-p$ prefix equals the length $n-p$ suffix. We present a two-pass streaming algorithm that computes wildcard-periods of $S$ using $\mathcal{O}(k3\,\mathsf{polylog}\,n)$ bits of space, while we also show that this problem cannot be solved in sublinear space in one pass. We then give a one-pass randomized streaming algorithm that computes all wildcard-periods $p$ of $S$ with $p<\frac{n}{2}$ and no wildcard characters appearing in the last $p$ symbols of $S$, using $\mathcal{O}(k3\mathsf{polylog}\, n)$ space.