Scaffoldings and Spines: Organizing High-Dimensional Data Using Cover Trees, Local Principal Component Analysis, and Persistent Homology (1602.06245v2)
Abstract: We propose a flexible and multi-scale method for organizing, visualizing, and understanding datasets sampled from or near stratified spaces. The first part of the algorithm produces a cover tree using adaptive thresholds based on a combination of multi-scale local principal component analysis and topological data analysis. The resulting cover tree nodes consist of points within or near the same stratum of the stratified space. These nodes are connected to form a scaffolding graph, which is then simplified and collapsed down into a spine graph. From this latter graph the stratified structure becomes apparent. We demonstrate our technique on several synthetic point cloud examples and we use it to understand song structure in musical audio data.
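As a rough illustration of the local principal component analysis ingredient mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a fixed ball radius and a variance-fraction rule for reading off a local dimension. The function names `local_pca_spectrum` and `estimate_local_dim`, the `radius` argument, and the `var_threshold` parameter are all hypothetical choices made for this example; the paper's adaptive thresholds and its use of persistent homology are not reproduced here.

```python
import numpy as np


def local_pca_spectrum(points, center, radius):
    """Covariance eigenvalues (descending) of the points within `radius` of `center`.

    Hypothetical helper: a plain fixed-radius local PCA, assumed for illustration.
    """
    dists = np.linalg.norm(points - center, axis=1)
    nbrs = points[dists <= radius]
    if len(nbrs) < 2:
        # Too few neighbours to form a covariance matrix.
        return np.zeros(points.shape[1])
    centered = nbrs - nbrs.mean(axis=0)
    cov = centered.T @ centered / (len(nbrs) - 1)
    # eigvalsh returns ascending eigenvalues; reverse and clip round-off negatives.
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    return np.clip(eigvals, 0.0, None)


def estimate_local_dim(eigvals, var_threshold=0.95):
    """Smallest number of leading eigenvalues explaining `var_threshold` of the variance.

    The 0.95 default is an assumption for this sketch, not a value from the paper.
    """
    total = eigvals.sum()
    if total <= 0:
        return 0
    cum = np.cumsum(eigvals) / total
    return int(np.searchsorted(cum, var_threshold) + 1)


# Two toy strata embedded in R^3: a line segment and a flat square patch.
t = np.linspace(-1.0, 1.0, 201)
line = np.column_stack([t, np.zeros_like(t), np.zeros_like(t)])

xs = np.linspace(-1.0, 1.0, 41)
gx, gy = np.meshgrid(xs, xs)
plane = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])

origin = np.zeros(3)
line_dim = estimate_local_dim(local_pca_spectrum(line, origin, radius=0.5))
plane_dim = estimate_local_dim(local_pca_spectrum(plane, origin, radius=0.5))
```

Repeating the computation over a range of radii gives a multi-scale picture: on data near a stratified space, the estimated dimension can change with the radius as the ball starts to include points from a neighbouring stratum.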