Hierarchical clustering with maximum density paths and mixture models (2503.15582v2)
Abstract: Hierarchical clustering is an effective, interpretable method for analyzing structure in data. It reveals insights at multiple scales without requiring a predefined number of clusters and captures nested patterns and subtle relationships, which are often missed by flat clustering approaches. However, existing hierarchical clustering methods struggle with high-dimensional data, especially when there are no clear density gaps between modes. In this work, we introduce t-NEB, a probabilistically grounded hierarchical clustering method, which yields state-of-the-art clustering performance on naturalistic high-dimensional data. t-NEB consists of three steps: (1) density estimation via overclustering; (2) finding maximum density paths between clusters; (3) creating a hierarchical structure via bottom-up cluster merging. t-NEB uses a probabilistic parametric density model for both overclustering and cluster merging, which yields both high clustering performance and a meaningful hierarchy, making it a valuable tool for exploratory data analysis. Code is available at https://github.com/ecker-lab/tneb clustering.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.