Explainable Clustering Techniques

Updated 26 January 2026
  • Clustering-based explainability techniques are methods that enhance the interpretability of clustering algorithms by providing structured, formal explanations for data groupings.
  • They employ a range of strategies including axis-aligned decision trees, prototype selection, and counterfactual analyses to reveal both global and local data patterns.
  • These techniques balance performance and interpretability through quantified trade-offs and approximation guarantees, enabling practical insights in complex domains.

Clustering-based explainability techniques encompass a diverse family of methods aimed at making the outcomes and internal logic of clustering algorithms more interpretable. These techniques provide formal or structured explanations for why data points have been grouped together, what differentiates clusters, and how global and local structures in the data relate to input features, tags, or representations. The field spans interpretable-by-design clustering algorithms, post hoc rule extraction, attribution- and prototype-based approaches, and explainability for clustering within black-box or deep models.

1. Formal Models for Explainable Clustering

A central model for explainable clustering is the axis-aligned decision tree partition: a $k$-clustering is explainable if it derives from a binary "threshold tree" in which each internal node splits the data via a single-feature threshold and each of the $k$ leaves induces a cluster. For $X \subset \mathbb{R}^d$, a threshold tree $T$ is defined with internal nodes representing splits of the form $x_i \le \theta$, so that each path to a leaf forms an axis-aligned conjunction of threshold cuts. This ensures every cluster is expressible as a simple rule over input features. The cost of an explainable clustering is compared to unrestricted optimal clusterings using objectives such as $k$-medians ($\ell_1$), $k$-means ($\ell_2^2$), or generalized $\ell_p$ power objectives:

$$\text{cost}_p(C_1,\dots,C_k) = \sum_{j=1}^k \min_{\mu\in\mathbb{R}^d} \sum_{x\in C_j} \|x-\mu\|_p^p$$

The "price of explainability" is the worst-case cost ratio between the best explainable and best unconstrained clusterings for a chosen objective. This quantifies how much optimality is traded for interpretability (Gamlath et al., 2021, Gupta et al., 2023, Laber et al., 2021).
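
The following sketch makes these two quantities concrete, assuming NumPy arrays for the data and the two cluster assignments; the function names are illustrative and not taken from the cited papers.

```python
import numpy as np

def cost_p(X, labels, p=2):
    """Generalized power cost: sum over clusters of min_mu sum_x ||x - mu||_p^p.
    Only p=1 (k-medians, coordinate-wise median) and p=2 (k-means, mean) have
    the simple closed-form center used here."""
    if p not in (1, 2):
        raise ValueError("closed-form centers implemented only for p in {1, 2}")
    total = 0.0
    for c in np.unique(labels):
        pts = X[labels == c]
        mu = pts.mean(axis=0) if p == 2 else np.median(pts, axis=0)
        total += np.sum(np.abs(pts - mu) ** p)
    return total

def price_of_explainability(X, tree_labels, reference_labels, p=2):
    """Cost ratio of a threshold-tree (explainable) clustering against an
    unconstrained reference clustering under the same l_p objective."""
    return cost_p(X, tree_labels, p) / cost_p(X, reference_labels, p)
```

In the theoretical results the reference clustering is the unconstrained optimum; in practice a strong baseline such as the output of Lloyd's $k$-means is often used in its place.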

2. Algorithmic Techniques and Approximation Guarantees

Several structural approaches have been developed for explainable clustering:

Oblivious Random-Cut Algorithms: Starting with $k$ reference centers $U$, the algorithm recursively samples random axis-aligned cuts until every reference center is separated into its own leaf of the tree. For $k$-medians, this randomized, oblivious approach achieves an expected $O(\log^2 k)$ approximation to optimum, and for $k$-means, $O(k\log^2 k)$. For $\ell_p$ objectives, the guarantee is $O(k^{p-1}\log^2 k)$. Notably, the algorithm's time complexity is $O(dk\log^2 k)$, independent of the number of data points (Gamlath et al., 2021).
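
A compact sketch of this style of random-cut construction is given below; it assumes the reference centers are supplied as a NumPy array and is a simplified illustration, not the exact procedure analyzed by Gamlath et al.

```python
import numpy as np

def random_cut_tree(centers, seed=0):
    """Recursively sample random axis-aligned cuts until every reference center
    sits alone in a leaf. Internal nodes are (feature, threshold, left, right)
    tuples; a leaf is the index of its single remaining center (= cluster id)."""
    rng = np.random.default_rng(seed)

    def build(ids):
        if len(ids) == 1:
            return int(ids[0])
        lo = centers[ids].min(axis=0)
        hi = centers[ids].max(axis=0)
        while True:                                # resample until the cut separates something
            i = rng.integers(centers.shape[1])     # random coordinate
            theta = rng.uniform(lo[i], hi[i])      # random threshold inside the bounding box
            left = ids[centers[ids, i] <= theta]
            right = ids[centers[ids, i] > theta]
            if len(left) and len(right):
                return (i, theta, build(left), build(right))

    return build(np.arange(len(centers)))

def assign(tree, x):
    """Route a point down the threshold tree to its leaf/cluster index."""
    while not isinstance(tree, int):
        i, theta, left, right = tree
        tree = left if x[i] <= theta else right
    return tree
```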

Random Thresholds: This variant repeatedly draws random axis-parallel cuts that separate the centers remaining within a leaf, and achieves a tight price of explainability $1+H_{k-1}=O(\log k)$ for $k$-medians. Matching lower bounds constructed via hitting-set and grid arguments establish that no axis-aligned threshold-tree clustering can achieve a better worst-case approximation. For $k$-means, a refined "bulk vs. solo cut" analysis reduces the price of explainability from $O(k\ln k)$ to $O(k\ln\ln k)$ (Gupta et al., 2023).

Lower Bounds and Hardness Results: For $k$-medians, any explainable tree must incur a cost at least $\Omega(\log k)$ times optimum; for $k$-means, $\Omega(k)$ is necessary in the worst case. It is NP-hard to find explainable $k$-median or $k$-means clusterings with approximation better than these ratios (Gupta et al., 2023, Laber et al., 2021). These results show the fundamental trade-off between interpretability (axis-aligned tree explanations) and approximation quality.

3. Explainability in Deep and Representation Learning

Explainability techniques have also been extended to deep learning and complex data modalities:

Deep Descriptive Clustering (DDC): DDC jointly learns sub-symbolic (deep feature) representations and generates cluster explanations using interpretable symbolic tags. The method maximizes mutual information between input and cluster assignments for high-quality clustering, and solves an integer linear program to select concise, orthogonal tag sets that explain each cluster. This yields improved normalized mutual information (NMI) and accuracy over baselines, as well as high-quality cluster-level explanations (e.g., Tag Coverage TC≈1.00, ITF≈2.32) (Zhang et al., 2021).
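
As an illustration of the descriptive step, the sketch below solves a simplified set-cover-style integer program with PuLP: choose the fewest tags such that every member of a cluster carries at least one chosen tag. This is a stand-in for, not a reproduction of, the ILP used in DDC, and it assumes every object carries at least one candidate tag.

```python
import pulp

def concise_tag_set(member_tags, candidate_tags):
    """member_tags: list of tag sets, one per object in the cluster.
    candidate_tags: iterable of all tags allowed in the explanation.
    Returns a small tag set covering every cluster member."""
    prob = pulp.LpProblem("cluster_tag_explanation", pulp.LpMinimize)
    z = {t: pulp.LpVariable(f"z_{t}", cat="Binary") for t in candidate_tags}
    prob += pulp.lpSum(z.values())                        # conciseness: use few tags
    for tags in member_tags:
        prob += pulp.lpSum(z[t] for t in tags) >= 1       # coverage of each object
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [t for t, var in z.items() if var.value() == 1]
```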

Clustering Activations in CNNs: Clustering intermediate activation vectors (e.g., via non-negative matrix factorization) in CNNs for digital pathology yields clusters that correspond to semantically meaningful morphological patterns, offering finer explanations than saliency map methods. Cluster assignments can be visualized globally on whole-slide images and provide interpretable breakdowns that align with pathological subtypes (Bajger et al., 18 Nov 2025).
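
A minimal sketch of this activation-clustering pipeline with scikit-learn's NMF, assuming a batch of post-ReLU (non-negative) feature maps; the layer choice and the number of concepts are illustrative.

```python
import numpy as np
from sklearn.decomposition import NMF

def activation_concept_map(activations, n_concepts=8):
    """activations: array of shape (N, H, W, C) taken from an intermediate CNN
    layer after ReLU, so all entries are non-negative. Each spatial location is
    factorized into n_concepts non-negative components and assigned to its
    dominant component, giving a per-image 'concept map' that can be overlaid
    on the input tile or whole-slide image."""
    n, h, w, c = activations.shape
    A = activations.reshape(-1, c)                        # one row per spatial location
    W = NMF(n_components=n_concepts, init="nndsvda",
            random_state=0, max_iter=500).fit_transform(A)
    return W.argmax(axis=1).reshape(n, h, w)              # hard concept/cluster per location
```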

Neuralization Approaches: Standard clustering algorithms (e.g., $k$-means, kernel $k$-means, GMMs) can be "neuralized" as feed-forward networks. Feature attributions (via Layer-wise Relevance Propagation or Integrated Gradients) can then explain cluster assignments in terms of input features or pixels, enabling per-cluster or per-point explanations in image and textual data (Kauffmann et al., 2019).
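
Below is a small sketch of the neuralized view for plain $k$-means: the evidence for cluster $c$ is a min-pooling over units that are linear in the input, so a simple gradient-times-input attribution (used here as a stand-in for LRP or Integrated Gradients) has a closed form.

```python
import numpy as np

def cluster_evidence(x, centers, c):
    """Neuralized k-means logit for cluster c at point x:
    h_c(x) = min_{k != c} ( ||x - mu_k||^2 - ||x - mu_c||^2 ).
    Each competing unit is linear in x, so the network is a linear layer
    followed by min-pooling; h_c(x) > 0 iff x is assigned to cluster c."""
    d_c = np.sum((x - centers[c]) ** 2)
    return min(np.sum((x - centers[k]) ** 2) - d_c
               for k in range(len(centers)) if k != c)

def gradient_times_input(x, centers, c):
    """Per-feature attribution of the cluster-c evidence. The gradient of the
    min equals the gradient of the active (arg-min) unit, 2 * (mu_c - mu_k*)."""
    others = [k for k in range(len(centers)) if k != c]
    d_c = np.sum((x - centers[c]) ** 2)
    k_star = min(others, key=lambda k: np.sum((x - centers[k]) ** 2) - d_c)
    return 2.0 * (centers[c] - centers[k_star]) * x
```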

4. Rule-Based and Descriptor-Oriented Explanations

Cluster Descriptors via Tags or Patterns: In tasks where objects are associated with tags or relational features, explainable clustering may focus on selecting a compact, discriminative set of tags per cluster (descriptors). The minimum disjoint tag descriptor problem and its relaxations are NP-hard; practical approaches involve quadratic unconstrained binary optimization (QUBO) formulations, augmented with regularizers favoring cluster-specific tags (tag modularity). These methods have demonstrated effectiveness and scalability when deployed on specialized hardware for both social and biomedical datasets (Liu et al., 2022).
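
The sketch below builds an illustrative QUBO matrix over binary tag-selection variables: the diagonal rewards tags that are frequent within the cluster, and the off-diagonal penalty discourages selecting pairs of tags that also co-occur in other clusters, loosely playing the role of the tag-modularity regularizer; the exact formulation of Liu et al. differs.

```python
import numpy as np

def descriptor_qubo(in_cluster_frequency, cross_cluster_cooccurrence, lam=1.0):
    """Build Q so that minimizing z^T Q z over binary z selects tags that are
    common inside the cluster while penalizing tag pairs that are not specific
    to it. Inputs are a length-n vector and an (n, n) matrix over candidate tags."""
    n = len(in_cluster_frequency)
    Q = np.zeros((n, n))
    Q[np.diag_indices(n)] = -np.asarray(in_cluster_frequency, dtype=float)
    Q += lam * np.triu(np.asarray(cross_cluster_cooccurrence, dtype=float), k=1)
    return Q

def qubo_energy(Q, z):
    """Objective value z^T Q z for a candidate binary selection z (e.g. from an
    annealer or exhaustive search over small tag sets)."""
    z = np.asarray(z, dtype=float)
    return float(z @ Q @ z)
```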

Rule Extraction via Frequent-Itemset Mining: Explanations can be formulated as conjunctions of predicates (rules) that maximize the coverage of a cluster while minimizing leakage into others. The problem maps to generalized frequent-itemset mining (gFIM) with support and conciseness thresholds, leveraging attribute selection and interval taxonomies for scalability. The Cluster-Explorer tool achieves superior rule quality and efficiency compared to decision-tree or post-hoc XAI baselines (Ofek et al., 2024).
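
A condensed sketch of this rule-mining recipe using the mlxtend apriori implementation (as a stand-in for the generalized FIM step in Cluster-Explorer); it assumes the attributes have already been discretized and one-hot encoded into boolean predicate columns.

```python
from mlxtend.frequent_patterns import apriori

def cluster_rules(predicates, labels, cluster, min_support=0.5, max_leakage=0.1):
    """predicates: boolean DataFrame whose columns are attribute-value (or
    interval) predicates; labels: cluster assignment per row. Mines itemsets
    that are frequent inside the target cluster (coverage) and keeps those that
    rarely hold outside it (low leakage)."""
    inside = predicates[labels == cluster]
    outside = predicates[labels != cluster]
    frequent = apriori(inside, min_support=min_support, use_colnames=True)
    rules = []
    for _, row in frequent.iterrows():
        conj = list(row["itemsets"])                      # conjunction of predicates
        leakage = outside[conj].all(axis=1).mean() if len(outside) else 0.0
        if leakage <= max_leakage:
            rules.append({"rule": conj, "coverage": row["support"], "leakage": leakage})
    return sorted(rules, key=lambda r: (-r["coverage"], r["leakage"]))
```

In practice the boolean predicate columns can be produced, for example, with pandas.get_dummies over binned numeric features.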

Pattern Mining and Declarative Constraints: Explainability-driven clustering can also be formulated as a combinatorial selection of clusters and their associated pattern-based explanations under declarative (e.g., constraint programming) frameworks. Coverage, discrimination, and prior knowledge constraints can be encoded at cluster or pattern levels, and optimized for multiple objectives (coverage, conciseness, discrimination, clustering quality) (Guilbert et al., 2024).

5. Attribution and Prototype-Based Explainability Methods

Permutation- and Perturbation-Based Feature Importance: G2PC (Global Permutation Percent Change) provides global feature importance by measuring the percent of cluster-label flips when feature (or domain) values are permuted; L2PC (Local Perturbation Percent Change) gives per-instance, per-feature importance by locally resampling feature values. Both are algorithm-agnostic and applicable across different clustering approaches or data modalities (Ellis et al., 2021).
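
A minimal algorithm-agnostic sketch of the global variant, assuming `assign_fn` is any function that maps a data matrix to cluster labels (for example, the `predict` method of a fitted scikit-learn `KMeans`); the local variant follows the same pattern but resamples a single instance's feature values.

```python
import numpy as np

def g2pc(assign_fn, X, n_repeats=20, seed=0):
    """Global permutation percent change: for each feature, permute its column,
    re-assign all points, and record the percentage whose cluster label flips;
    averaging over repeats gives a global importance score per feature."""
    rng = np.random.default_rng(seed)
    base = assign_fn(X)
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        flip_rates = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            flip_rates.append(np.mean(assign_fn(Xp) != base))
        importance[j] = 100.0 * np.mean(flip_rates)
    return importance
```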

Exemplar-Based Explanations: Clusters can be explained by a small subset of representative points (exemplars), selected so that each cluster member is close to at least one exemplar. Although finding the smallest such set is NP-hard, practical greedy algorithms offer approximation guarantees (e.g., within a factor $O(\log n)$ of the minimum). This supports interpretability in domains with deep or non-interpretable embeddings, text, or images, providing concrete, human-understandable cluster representatives (Davidson et al., 2022).
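
A short sketch of the greedy cover heuristic, assuming a precomputed pairwise-distance matrix and a coverage radius; the greedy rule of always adding the candidate that covers the most uncovered members is what yields the classical $O(\log n)$ set-cover guarantee.

```python
import numpy as np

def greedy_exemplars(D, labels, radius):
    """D: (n, n) pairwise distances; labels: cluster per point; radius >= 0.
    Select exemplars so every point is within `radius` of an exemplar from its
    own cluster. Since each point covers at least itself, the loop terminates."""
    exemplars = []
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        uncovered = set(members)
        while uncovered:
            # candidate covering the largest number of still-uncovered members
            best = max(members,
                       key=lambda e: sum(D[e, m] <= radius for m in uncovered))
            exemplars.append(int(best))
            uncovered -= {m for m in uncovered if D[best, m] <= radius}
    return exemplars
```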

Shapley-Based Attributions in Clustering: SHAP (Shapley Additive Explanations) values, computed in a supervised or semi-supervised setting, are used to obtain per-feature attributions for each instance. Clustering in the induced Shapley-attribution space (e.g., using UMAP and HDBSCAN) can recover semantically meaningful subgroups, particularly in diagnosis and prognosis applications, with subsequent rule extraction in raw feature space. Evaluation metrics include cluster quality, rule coverage and precision, and relevance for fault diagnosis (Cohen et al., 2023).
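
A brief sketch of the attribution-space clustering step, assuming the per-instance SHAP values have already been computed as an (n_instances, n_features) matrix (for example from a tree-ensemble surrogate); the UMAP and HDBSCAN hyperparameters shown are placeholders.

```python
import numpy as np
import umap
import hdbscan

def subgroups_in_attribution_space(shap_matrix, n_neighbors=15, min_cluster_size=20):
    """Embed instances by their Shapley attribution vectors with UMAP, then find
    density-based subgroups with HDBSCAN (label -1 marks noise points). The
    resulting subgroups can afterwards be described by rules in raw feature space."""
    embedding = umap.UMAP(n_components=2, n_neighbors=n_neighbors,
                          random_state=0).fit_transform(np.asarray(shap_matrix))
    labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(embedding)
    return embedding, labels
```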

6. Counterfactual and Model-Agnostic Approaches

Counterfactual Explanations for Clustering: Counterfactual methods generate hypothetical points that would move an instance from its assigned cluster to a specified target cluster, maximizing similarity (Gower distance, feature overlap) and a "soft-score" that quantifies proximity or membership probability with respect to the target cluster. The methodology is model-agnostic and accommodates both centroid-based and density-based clustering, significantly increasing the percentage of instances for which actionable counterfactuals can be found without compromising input sparsity or computation time (Spagnol et al., 2024).
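
The sketch below is a deliberately simple, model-agnostic greedy search in this spirit (not the algorithm of Spagnol et al.): it assumes `assign_fn` maps a batch of points to cluster labels and that a prototype of the target cluster (for example its centroid or medoid) is available, and it changes as few features as possible.

```python
import numpy as np

def greedy_counterfactual(x, target_prototype, assign_fn, target_cluster):
    """Starting from instance x, repeatedly overwrite the single remaining
    feature with the largest gap to the target prototype, until the point is
    assigned to the target cluster. Changing features one at a time keeps the
    counterfactual sparse; returns (counterfactual, changed_features), or
    (None, changed_features) if even the full prototype does not flip the label."""
    x_cf = np.array(x, dtype=float)
    gaps = np.abs(x_cf - np.asarray(target_prototype, dtype=float))
    changed = []
    for j in np.argsort(-gaps):                           # largest gaps first
        if assign_fn(x_cf[None, :])[0] == target_cluster:
            return x_cf, changed
        x_cf[j] = target_prototype[j]
        changed.append(int(j))
    if assign_fn(x_cf[None, :])[0] == target_cluster:
        return x_cf, changed
    return None, changed
```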

Supervised Post-Hoc Attributions for Clusters: Training supervised models (e.g., neural networks, logistic regression) to predict cluster assignments allows the application of classical feature-importance tests (e.g., Single Feature Introduction Test, SFIT), providing statistical confidence and median-based importance measures for features characterizing each cluster. This two-step approach is flexible and robust for regulatory or compliance contexts (Horel et al., 2019).
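
A compact sketch of the two-step recipe with scikit-learn, using a logistic-regression surrogate and permutation importance as a simple stand-in for SFIT-style single-feature tests; the statistical confidence intervals and median-based scores of the cited approach are not reproduced here.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance

def post_hoc_cluster_importance(X, cluster_labels, n_repeats=30):
    """Step 1: fit a supervised surrogate that predicts the cluster assignments.
    Step 2: score each feature by how much shuffling it degrades the surrogate's
    accuracy at recovering those assignments."""
    surrogate = LogisticRegression(max_iter=1000).fit(X, cluster_labels)
    result = permutation_importance(surrogate, X, cluster_labels,
                                    n_repeats=n_repeats, random_state=0)
    return result.importances_mean, result.importances_std
```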


In summary, clustering-based explainability techniques comprise a multifaceted toolkit, ranging from decision-tree-induced clusterings with formal approximation guarantees (Gamlath et al., 2021, Gupta et al., 2023), to tag-based pattern descriptors (Liu et al., 2022), to deep and black-box model attribution (Zhang et al., 2021, Kauffmann et al., 2019), counterfactual (Spagnol et al., 2024), rule-mining (Ofek et al., 2024), and prototype/exemplar strategies (Davidson et al., 2022). The theoretical results tightly characterize the loss incurred for axis-aligned explainable clusterings, while algorithmic innovations enable practical interpretable clustering at scale and in complex domains. A persistent theme is the trade-off between interpretability (often formalized via axis-aligned trees, tags, or patterns) and clustering performance, with worst-case gaps that are near-optimal up to logarithmic factors in $k$ for $k$-medians and linear in $k$ for $k$-means.
