Disambiguating Cluster Separation in Neighbor Embeddings
Determine whether cluster separation observed in low-dimensional neighbor embeddings such as t-SNE and UMAP reflects genuine structure in the high-dimensional data manifold or arises from the optimization process distorting the manifold to favor local compactness. Establish criteria or mechanisms that distinguish true manifold-driven separation from optimizer-induced artifacts.
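As a purely illustrative sketch of what one such criterion might look like, the snippet below compares how separated a set of clusters is in the original space with how separated the same clusters appear in the embedding. The ratio used here, the function names `separation_ratio` and `inflation`, and the toy coordinates are all assumptions introduced for illustration; they do not come from the cited paper, and a large value is only a heuristic warning sign, not a resolution of the question.

```python
import math


def separation_ratio(points, labels):
    """Mean between-cluster distance divided by mean within-cluster distance.

    `points` is a sequence of coordinate tuples and `labels` assigns each
    point to a cluster. Both clusters must contain at least two points.
    """
    within, between = [], []
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if labels[i] == labels[j]:
                within.append(d)
            else:
                between.append(d)
    return (sum(between) / len(between)) / (sum(within) / len(within))


def inflation(high, low, labels):
    """Ratio of embedding-space separation to input-space separation.

    Values well above 1 mean the embedding shows the clusters as far more
    separated than they are in the original coordinates (hypothetical
    diagnostic, not an established test).
    """
    return separation_ratio(low, labels) / separation_ratio(high, labels)


# Toy data: four points that are evenly spread in the input space ...
high = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# ... and a made-up "embedding" that pulls the two label groups far apart.
low = [(0.0, 0.0), (1.0, 0.0), (0.0, 10.0), (1.0, 10.0)]
labels = [0, 0, 1, 1]

print("input-space separation:", separation_ratio(high, labels))
print("embedding separation:  ", separation_ratio(low, labels))
print("inflation factor:      ", inflation(high, low, labels))
```

In practice `low` would be the output of t-SNE or UMAP and `labels` the clusters read off the embedding. Because Euclidean distances in the input space need not follow the manifold, a check like this can flag suspicious separation but cannot by itself establish which explanation is correct.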
References
When an embedding shows separated clusters, it remains unclear whether this separation reflects true data structure or arises from the optimizer distorting the underlying manifold to favor local compactness.
— A Spectral Framework for Multi-Scale Nonlinear Dimensionality Reduction
(2604.02535 - Huang et al., 2 Apr 2026) in Introduction