Interesting Paths in the Mapper (1712.10197v2)

Published 29 Dec 2017 in cs.CG, cs.DS, and math.AT

Abstract: The Mapper produces a compact summary of high dimensional data as a simplicial complex. We study the problem of quantifying the interestingness of subpopulations in a Mapper, which appear as long paths, flares, or loops. First, we create a weighted directed graph G using the 1-skeleton of the Mapper. We use the average values at the vertices of a target function to direct edges (from low to high). The difference between the average values at the two vertices (high minus low) is set as the edge's weight. Covariation of the remaining h functions (independent variables) is captured by an h-bit binary signature assigned to the edge. An interesting path in G is a directed path whose edges all have the same signature. We define the interestingness score of such a path as a sum of its edge weights multiplied by a nonlinear function of their ranks in the path. Second, we study three optimization problems on this graph G. In the problem Max-IP, we seek an interesting path in G with the maximum interestingness score. We show that Max-IP is NP-complete. For the special case when G is a directed acyclic graph (DAG), we show that Max-IP can be solved in polynomial time, namely in O(mnd_i), where d_i is the maximum indegree of a vertex in G. In the more general problem IP, the goal is to find a collection of edge-disjoint interesting paths such that the overall sum of their interestingness scores is maximized. We also study a variant of IP termed k-IP, where the goal is to identify a collection of edge-disjoint interesting paths, each with k edges, such that their total interestingness score is maximized. While k-IP can be solved in polynomial time for k <= 2, we show that k-IP is NP-complete for k >= 3 even when G is a DAG. We develop polynomial time heuristics for IP and k-IP on DAGs.
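The construction described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the rank function is taken to be log(1 + i) (the abstract says only "a nonlinear function" of rank), a signature bit is set to 1 when the corresponding independent variable does not decrease along the edge, edges with tied target averages are skipped, and the names build_graph, path_score, and max_interesting_path are hypothetical. The search is a simple dynamic program over (edge, rank) pairs whose cost is in line with the O(mnd_i) bound quoted for DAGs.

```python
import math
from collections import defaultdict

def build_graph(edges, target, others):
    """Direct each Mapper edge from low to high target average.

    edges  : iterable of (u, v) pairs from the 1-skeleton
    target : dict vertex -> average value of the target function
    others : dict vertex -> tuple of h averages (independent variables)
    Returns a list of (tail, head, weight, signature) tuples.
    """
    directed = []
    for u, v in edges:
        if target[u] == target[v]:
            continue  # assumption: a tie gives no direction, so skip it
        lo, hi = (u, v) if target[u] < target[v] else (v, u)
        weight = target[hi] - target[lo]
        # assumption: bit j is 1 if variable j does not decrease along the edge
        sig = tuple(int(b >= a) for a, b in zip(others[lo], others[hi]))
        directed.append((lo, hi, weight, sig))
    return directed

def rank_factor(i):
    """Assumed nonlinear rank function; i is the 1-based rank in the path."""
    return math.log(1 + i)

def path_score(weights, f=rank_factor):
    """Sum of edge weights, each scaled by a function of its rank."""
    return sum(w * f(i) for i, w in enumerate(weights, start=1))

def max_interesting_path(directed, f=rank_factor):
    """Best same-signature directed path in a DAG, by DP over (edge, rank)."""
    incoming = defaultdict(list)  # vertex -> indices of edges entering it
    for idx, (_, head, _, _) in enumerate(directed):
        incoming[head].append(idx)
    n = len({x for t, h, _, _ in directed for x in (t, h)})
    # layers[r][e] = (best score of a path whose (r+1)-th edge is e, predecessor)
    layers = [{i: (w * f(1), None) for i, (_, _, w, _) in enumerate(directed)}]
    for rank in range(2, n):  # a path on n vertices has at most n - 1 edges
        prev, cur = layers[-1], {}
        for idx, (tail, _, w, sig) in enumerate(directed):
            cands = [(prev[p][0], p) for p in incoming[tail]
                     if p in prev and directed[p][3] == sig]
            if cands:
                best_prev, p = max(cands)
                cur[idx] = (best_prev + w * f(rank), p)
        if not cur:
            break
        layers.append(cur)
    best = max(((s, r, i) for r, layer in enumerate(layers)
                for i, (s, _) in layer.items()), default=None)
    if best is None:
        return 0.0, []
    score, r, idx = best
    path = []
    while idx is not None:  # follow predecessor pointers back to rank 1
        path.append(directed[idx][:2])
        idx = layers[r][idx][1]
        r -= 1
    return score, path[::-1]

# Toy data (made up for illustration): one target and h = 1 other variable.
target = {"a": 1, "b": 2, "c": 4, "d": 7}
others = {"a": (0,), "b": (1,), "c": (2,), "d": (1,)}
graph = build_graph([("b", "a"), ("b", "c"), ("c", "d"), ("a", "c")],
                    target, others)
score, path = max_interesting_path(graph)
print(path, round(score, 3))
```

In this toy example the edge c -> d carries a different signature from the others, so no interesting path can extend through it. Because the rank factor grows with position, the two-edge path a -> b -> c outscores the single heavier edge a -> c.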

Citations (5)

Related Papers

A distribution-guided Mapper algorithm (2024)
Algorithms and hardness results for happy coloring problems (2017)
Rainbow Colouring of Split Graphs (2014)
Isometric path complexity of graphs (2022)
Approximability Distance in the Space of H-Colourability Problems (2008)