Curve Simplification and Clustering under Fréchet Distance (2207.07809v3)
Abstract: We present new approximation results on curve simplification and clustering under Fréchet distance. Let $T = \{\tau_i : i \in [n]\}$ be polygonal curves in $\mathbb{R}^d$ of $m$ vertices each. Let $l$ be any integer from $[m]$. We study a generalized curve simplification problem: given error bounds $\delta_i > 0$ for $i \in [n]$, find a curve $\sigma$ of at most $l$ vertices such that $d_F(\sigma,\tau_i) \le \delta_i$ for $i \in [n]$. We present an algorithm that returns a null output or a curve $\sigma$ of at most $l$ vertices such that $d_F(\sigma,\tau_i) \le \delta_i + \epsilon\delta_{\max}$ for $i \in [n]$, where $\delta_{\max} = \max_{i \in [n]} \delta_i$. If the output is null, there is no curve of at most $l$ vertices within a Fréchet distance of $\delta_i$ from $\tau_i$ for $i \in [n]$. The running time is $\tilde{O}\bigl(n^{O(l)} m^{O(l^2)} (dl/\epsilon)^{O(dl)}\bigr)$. This algorithm yields the first polynomial-time bicriteria approximation scheme to simplify a curve $\tau$ to another curve $\sigma$, where the vertices of $\sigma$ can be anywhere in $\mathbb{R}^d$, so that $d_F(\sigma,\tau) \le (1+\epsilon)\delta$ and $|\sigma| \le (1+\alpha) \min\{|c| : d_F(c,\tau) \le \delta\}$ for any given $\delta > 0$ and any fixed $\alpha, \epsilon \in (0,1)$. The running time is $\tilde{O}\bigl(m^{O(1/\alpha)} (d/(\alpha\epsilon))^{O(d/\alpha)}\bigr)$. By combining our technique with some previous results in the literature, we obtain an approximation algorithm for $(k,l)$-median clustering. Given $T$, it computes a set $\Sigma$ of $k$ curves, each of $l$ vertices, such that $\sum_{i \in [n]} \min_{\sigma \in \Sigma} d_F(\sigma,\tau_i)$ is within a factor $1+\epsilon$ of the optimum with probability at least $1-\mu$ for any given $\mu, \epsilon \in (0,1)$. The running time is $\tilde{O}\bigl(n m^{O(kl^2)} \mu^{-O(kl)} (dkl/\epsilon)^{O((dkl/\epsilon)\log(1/\mu))}\bigr)$.
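
To make the relaxed guarantee $d_F(\sigma,\tau_i) \le \delta_i + \epsilon\delta_{\max}$ concrete, the following minimal sketch checks it for a candidate curve. It is not the algorithm of the paper: it only verifies a given $\sigma$, and it substitutes the discrete Fréchet distance (computed over vertices only, by a standard dynamic program) for the continuous distance $d_F$. Since the discrete distance is an upper bound on the continuous one, a `True` result implies the continuous bound holds, while a `False` result is inconclusive. The function names `discrete_frechet` and `satisfies_relaxed_bounds` are hypothetical and chosen for illustration.

```python
from math import dist, inf


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two vertex sequences.

    p and q are non-empty lists of coordinate tuples in the same dimension.
    This is a stand-in for the continuous Fréchet distance d_F and is
    never smaller than it.
    """
    n, m = len(p), len(q)
    # dp[i][j] = discrete Fréchet distance between p[:i+1] and q[:j+1]
    dp = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
                continue
            # Best way to arrive at (i, j): advance on p, on q, or on both.
            best_prev = min(
                dp[i - 1][j] if i > 0 else inf,
                dp[i][j - 1] if j > 0 else inf,
                dp[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            dp[i][j] = max(d, best_prev)
    return dp[n - 1][m - 1]


def satisfies_relaxed_bounds(sigma, curves, deltas, eps):
    """Check d(sigma, tau_i) <= delta_i + eps * delta_max for every i.

    Assumption: d is the discrete Fréchet distance above, used as a
    conservative proxy for d_F.
    """
    delta_max = max(deltas)
    return all(
        discrete_frechet(sigma, tau) <= delta + eps * delta_max
        for tau, delta in zip(curves, deltas)
    )


if __name__ == "__main__":
    sigma = [(0.0, 0.0), (2.0, 0.0)]            # candidate with l = 2 vertices
    tau = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # input curve with m = 3 vertices
    print(discrete_frechet(sigma, tau))
    print(satisfies_relaxed_bounds(sigma, [tau], [1.5], 0.1))
```

The dynamic program takes $O(|\sigma| \cdot m)$ time per input curve, so checking one candidate against all $n$ curves is cheap; the cost reflected in the running times stated in the abstract belongs to the paper's algorithm for producing $\sigma$, which this sketch does not attempt.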