On Computing a Center Persistence Diagram (1910.01753v2)
Abstract: Throughout this paper, a persistence diagram ${\cal P}$ is composed of a set $P$ of planar points (each corresponding to a topological feature) above the line $Y=X$, together with the line $Y=X$ itself, i.e., ${\cal P}=P\cup\{(x,y)\mid y=x\}$. Given a set of persistence diagrams ${\cal P}_1,\ldots,{\cal P}_m$, one way to summarize their topological features for the purpose of data reduction is to first compute their center ${\cal C}$ under the bottleneck distance. We consider two discrete versions and one continuous version of this problem. For technical reasons, we first focus on the case where the $|P_i|$'s are all the same (i.e., all sets have the same size $n$), and the problem is to compute a center point set $C$ under the bottleneck matching distance. We show, by a non-trivial reduction from the Planar 3D-Matching problem, that this problem is NP-hard even when only $m=3$ diagrams are given. This implies that the general center problem for persistence diagrams under the bottleneck distance, where the $P_i$'s may have different sizes, is also NP-hard when $m\geq 3$. On the positive side, we show that the problem is polynomially solvable when $m=2$ and admits a factor-2 approximation for $m\geq 3$. These positive results hold for any $L_p$ metric when the $P_i$'s are point sets of the same size, and also hold in the $L_\infty$ metric when the $P_i$'s have different sizes (i.e., for the Center Persistence Diagram problem). This approximation factor is the best possible in polynomial time for the Center Persistence Diagram problem under the bottleneck distance unless P = NP. All these results hold for both discrete versions as well as the continuous version; in fact, the NP-hardness and approximation results also hold under the Wasserstein distance for the continuous version.
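To make the equal-size setting concrete, the following Python sketch computes the bottleneck matching distance between two point sets of the same size $n$ and then selects, among the inputs, the set with the smallest maximum distance to all the others. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`lp_dist`, `bottleneck_matching_distance`, `best_input_as_center`) are hypothetical, the matching test uses a simple augmenting-path routine rather than an optimized one, and the diagonal $Y=X$ is ignored, so it covers only point sets of equal size. Returning one of the inputs is a standard way to get a factor-2 guarantee in any metric space: if $C^*$ is an optimal center with radius $r^*$, the triangle inequality gives $d(P_i,P_j)\leq d(P_i,C^*)+d(C^*,P_j)\leq 2r^*$.

```python
INF = float("inf")

def lp_dist(a, b, p=INF):
    """L_p distance between two planar points (p = inf gives L_infinity)."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    if p == INF:
        return max(dx, dy)
    return (dx ** p + dy ** p) ** (1.0 / p)

def has_perfect_matching(A, B, r, p=INF):
    """True if A and B can be perfectly matched using only pairs at distance <= r."""
    n = len(A)
    match_b = [-1] * n  # match_b[j] = index in A currently matched to B[j]

    def try_augment(i, seen):
        # Standard augmenting-path search for bipartite matching.
        for j in range(n):
            if not seen[j] and lp_dist(A[i], B[j], p) <= r:
                seen[j] = True
                if match_b[j] == -1 or try_augment(match_b[j], seen):
                    match_b[j] = i
                    return True
        return False

    return all(try_augment(i, [False] * n) for i in range(n))

def bottleneck_matching_distance(A, B, p=INF):
    """Smallest r such that a perfect matching with all pair distances <= r exists."""
    if len(A) != len(B):
        raise ValueError("this sketch assumes point sets of equal size")
    if not A:
        return 0.0
    # The optimum is always one of the pairwise distances, so binary-search over them.
    candidates = sorted({lp_dist(a, b, p) for a in A for b in B})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if has_perfect_matching(A, B, candidates[mid], p):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]

def best_input_as_center(sets, p=INF):
    """Return (index, radius) of the input set minimizing its maximum distance to the others."""
    def radius(i):
        return max(bottleneck_matching_distance(sets[i], S, p) for S in sets)
    best = min(range(len(sets)), key=radius)
    return best, radius(best)

if __name__ == "__main__":
    P1 = [(0, 2), (1, 5)]
    P2 = [(0, 3), (1, 4)]
    P3 = [(0, 4), (1, 6)]
    print(bottleneck_matching_distance(P1, P2))
    print(best_input_as_center([P1, P2, P3]))
```

The binary search relies on the fact that feasibility is monotone in $r$: if a perfect matching exists at threshold $r$, it also exists at any larger threshold.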