Explainable Clustering Techniques

Updated 26 January 2026
  • Clustering-based explainability techniques are methods that enhance the interpretability of clustering algorithms by providing structured, formal explanations for data groupings.
  • They employ a range of strategies including axis-aligned decision trees, prototype selection, and counterfactual analyses to reveal both global and local data patterns.
  • These techniques balance performance and interpretability through quantified trade-offs and approximation guarantees, enabling practical insights in complex domains.

Clustering-based explainability techniques encompass a diverse family of methods aimed at making the outcomes and internal logic of clustering algorithms more interpretable. These techniques provide formal or structured explanations for why data points have been grouped together, what differentiates clusters, and how global and local structures in the data relate to input features, tags, or representations. The field spans interpretable-by-design clustering algorithms, post hoc rule extraction, attribution- and prototype-based approaches, and explainability for clustering within black-box or deep models.

1. Formal Models for Explainable Clustering

A central model for explainable clustering is the axis-aligned decision tree partition: a $k$-clustering is explainable if it derives from a binary "threshold tree" in which each internal node splits the data via a single-feature threshold and each of the $k$ leaves induces a cluster. For $X \subset \mathbb{R}^d$, a threshold tree $T$ is defined with internal nodes representing splits of the form $x_i \le \theta$, so that each path to a leaf forms an axis-aligned conjunction of threshold cuts. This ensures every cluster is expressible as a simple rule over input features. The cost of an explainable clustering is compared to unrestricted optimal clusterings using objectives such as $k$-medians ($\ell_1$), $k$-means ($\ell_2^2$), or generalized $\ell_p$ power objectives:

$$\text{cost}_p(C_1,\dots,C_k) = \sum_{j=1}^k \min_{\mu\in\mathbb{R}^d} \sum_{x\in C_j} \|x-\mu\|_p^p$$

The "price of explainability" is the worst-case cost ratio between the best explainable and best unconstrained clusterings for a chosen objective. This quantifies how much optimality is traded for interpretability (Gamlath et al., 2021, Gupta et al., 2023, Laber et al., 2021).
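
The following sketch makes these two quantities concrete, assuming NumPy arrays for the data and the two cluster assignments; the function names are illustrative and not taken from the cited papers.

```python
import numpy as np

def cost_p(X, labels, p=2):
    """Generalized power cost: sum over clusters of min_mu sum_x ||x - mu||_p^p.
    Only p=1 (k-medians, coordinate-wise median) and p=2 (k-means, mean) have
    the simple closed-form center used here."""
    if p not in (1, 2):
        raise ValueError("closed-form centers implemented only for p in {1, 2}")
    total = 0.0
    for c in np.unique(labels):
        pts = X[labels == c]
        mu = pts.mean(axis=0) if p == 2 else np.median(pts, axis=0)
        total += np.sum(np.abs(pts - mu) ** p)
    return total

def price_of_explainability(X, tree_labels, reference_labels, p=2):
    """Cost ratio of a threshold-tree (explainable) clustering against an
    unconstrained reference clustering under the same l_p objective."""
    return cost_p(X, tree_labels, p) / cost_p(X, reference_labels, p)
```

In the theoretical results the reference clustering is the unconstrained optimum; in practice a strong baseline such as the output of Lloyd's $k$-means is often used in its place.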

2. Algorithmic Techniques and Approximation Guarantees

Several structural approaches have been developed for explainable clustering:

Oblivious Random-Cut Algorithms: Starting with $k$ reference centers $U$, the algorithm recursively samples random axis-aligned cuts until every reference center is separated into its own leaf of the tree. For $k$-medians, this randomized, oblivious approach achieves an expected $O(\log^2 k)$ approximation to optimum, and for $k$-means, $O(k\log^2 k)$. For $\ell_p$ objectives, the guarantee is $O(k^{p-1}\log^2 k)$. Notably, the algorithm's time complexity is $O(dk\log^2 k)$, independent of the number of data points (Gamlath et al., 2021).
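
A compact sketch of this style of random-cut construction is given below; it assumes the reference centers are supplied as a NumPy array and is a simplified illustration, not the exact procedure analyzed by Gamlath et al.

```python
import numpy as np

def random_cut_tree(centers, seed=0):
    """Recursively sample random axis-aligned cuts until every reference center
    sits alone in a leaf. Internal nodes are (feature, threshold, left, right)
    tuples; a leaf is the index of its single remaining center (= cluster id)."""
    rng = np.random.default_rng(seed)

    def build(ids):
        if len(ids) == 1:
            return int(ids[0])
        lo = centers[ids].min(axis=0)
        hi = centers[ids].max(axis=0)
        while True:                                # resample until the cut separates something
            i = rng.integers(centers.shape[1])     # random coordinate
            theta = rng.uniform(lo[i], hi[i])      # random threshold inside the bounding box
            left = ids[centers[ids, i] <= theta]
            right = ids[centers[ids, i] > theta]
            if len(left) and len(right):
                return (i, theta, build(left), build(right))

    return build(np.arange(len(centers)))

def assign(tree, x):
    """Route a point down the threshold tree to its leaf/cluster index."""
    while not isinstance(tree, int):
        i, theta, left, right = tree
        tree = left if x[i] <= theta else right
    return tree
```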

Random Thresholds: This variant repeatedly draws random axis-parallel cuts that separate the centers remaining within a leaf, and achieves a tight price of explainability $1+H_{k-1}=O(\log k)$ for $k$-medians. Matching lower bounds constructed via hitting-set and grid arguments establish that no axis-aligned threshold-tree clustering can achieve a better worst-case approximation. For $k$-means, a refined "bulk vs. solo cut" analysis reduces the price of explainability from $O(k\ln k)$ to $O(k\ln\ln k)$ (Gupta et al., 2023).

Lower Bounds and Hardness Results: For $k$-medians, any explainable tree must incur a cost at least $\Omega(\log k)$ times optimum; for $k$-means, $\Omega(k)$ is necessary in the worst case. It is NP-hard to find explainable $k$-median or $k$-means clusterings with approximation better than these ratios (Gupta et al., 2023, Laber et al., 2021). These results show the fundamental trade-off between interpretability (axis-aligned tree explanations) and approximation quality.

3. Explainability in Deep and Representation Learning

Explainability techniques have also been extended to deep learning and complex data modalities:

Deep Descriptive Clustering (DDC): DDC jointly learns sub-symbolic (deep feature) representations and generates cluster explanations using interpretable symbolic tags. The method maximizes mutual information between input and cluster assignments for high-quality clustering, and solves an integer linear program to select concise, orthogonal tag sets that explain each cluster. This yields improved normalized mutual information (NMI) and accuracy over baselines, as well as high-quality cluster-level explanations (e.g., Tag Coverage TC≈1.00, ITF≈2.32) (Zhang et al., 2021).
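
As an illustration of the descriptive step, the sketch below solves a simplified set-cover-style integer program with PuLP: choose the fewest tags such that every member of a cluster carries at least one chosen tag. This is a stand-in for, not a reproduction of, the ILP used in DDC, and it assumes every object carries at least one candidate tag.

```python
import pulp

def concise_tag_set(member_tags, candidate_tags):
    """member_tags: list of tag sets, one per object in the cluster.
    candidate_tags: iterable of all tags allowed in the explanation.
    Returns a small tag set covering every cluster member."""
    prob = pulp.LpProblem("cluster_tag_explanation", pulp.LpMinimize)
    z = {t: pulp.LpVariable(f"z_{t}", cat="Binary") for t in candidate_tags}
    prob += pulp.lpSum(z.values())                        # conciseness: use few tags
    for tags in member_tags:
        prob += pulp.lpSum(z[t] for t in tags) >= 1       # coverage of each object
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [t for t, var in z.items() if var.value() == 1]
```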

Clustering Activations in CNNs: Clustering intermediate activation vectors (e.g., via non-negative matrix factorization) in CNNs for digital pathology yields clusters that correspond to semantically meaningful morphological patterns, offering finer explanations than saliency map methods. Cluster assignments can be visualized globally on whole-slide images and provide interpretable breakdowns that align with pathological subtypes (Bajger et al., 18 Nov 2025).
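
A minimal sketch of this activation-clustering pipeline with scikit-learn's NMF, assuming a batch of post-ReLU (non-negative) feature maps; the layer choice and the number of concepts are illustrative.

```python
import numpy as np
from sklearn.decomposition import NMF

def activation_concept_map(activations, n_concepts=8):
    """activations: array of shape (N, H, W, C) taken from an intermediate CNN
    layer after ReLU, so all entries are non-negative. Each spatial location is
    factorized into n_concepts non-negative components and assigned to its
    dominant component, giving a per-image 'concept map' that can be overlaid
    on the input tile or whole-slide image."""
    n, h, w, c = activations.shape
    A = activations.reshape(-1, c)                        # one row per spatial location
    W = NMF(n_components=n_concepts, init="nndsvda",
            random_state=0, max_iter=500).fit_transform(A)
    return W.argmax(axis=1).reshape(n, h, w)              # hard concept/cluster per location
```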

Neuralization Approaches: Standard clustering algorithms (e.g., $k$-means, kernel $k$-means, GMMs) can be "neuralized" as feed-forward networks. Feature attributions (via Layer-wise Relevance Propagation or Integrated Gradients) can then explain cluster assignments in terms of input features or pixels, enabling per-cluster or per-point explanations in image and textual data (Kauffmann et al., 2019).
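
Below is a small sketch of the neuralized view for plain $k$-means: the evidence for cluster $c$ is a min-pooling over units that are linear in the input, so a simple gradient-times-input attribution (used here as a stand-in for LRP or Integrated Gradients) has a closed form.

```python
import numpy as np

def cluster_evidence(x, centers, c):
    """Neuralized k-means logit for cluster c at point x:
    h_c(x) = min_{k != c} ( ||x - mu_k||^2 - ||x - mu_c||^2 ).
    Each competing unit is linear in x, so the network is a linear layer
    followed by min-pooling; h_c(x) > 0 iff x is assigned to cluster c."""
    d_c = np.sum((x - centers[c]) ** 2)
    return min(np.sum((x - centers[k]) ** 2) - d_c
               for k in range(len(centers)) if k != c)

def gradient_times_input(x, centers, c):
    """Per-feature attribution of the cluster-c evidence. The gradient of the
    min equals the gradient of the active (arg-min) unit, 2 * (mu_c - mu_k*)."""
    others = [k for k in range(len(centers)) if k != c]
    d_c = np.sum((x - centers[c]) ** 2)
    k_star = min(others, key=lambda k: np.sum((x - centers[k]) ** 2) - d_c)
    return 2.0 * (centers[c] - centers[k_star]) * x
```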

4. Rule-Based and Descriptor-Oriented Explanations

Cluster Descriptors via Tags or Patterns: In tasks where objects are associated with tags or relational features, explainable clustering may focus on selecting a compact, discriminative set of tags per cluster (descriptors). The minimum disjoint tag descriptor problem and its relaxations are NP-hard; practical approaches involve quadratic unconstrained binary optimization (QUBO) formulations, augmented with regularizers favoring cluster-specific tags (tag modularity). These methods have demonstrated effectiveness and scalability when deployed on specialized hardware for both social and biomedical datasets (Liu et al., 2022).
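
The sketch below builds an illustrative QUBO matrix over binary tag-selection variables: the diagonal rewards tags that are frequent within the cluster, and the off-diagonal penalty discourages selecting pairs of tags that also co-occur in other clusters, loosely playing the role of the tag-modularity regularizer; the exact formulation of Liu et al. differs.

```python
import numpy as np

def descriptor_qubo(in_cluster_frequency, cross_cluster_cooccurrence, lam=1.0):
    """Build Q so that minimizing z^T Q z over binary z selects tags that are
    common inside the cluster while penalizing tag pairs that are not specific
    to it. Inputs are a length-n vector and an (n, n) matrix over candidate tags."""
    n = len(in_cluster_frequency)
    Q = np.zeros((n, n))
    Q[np.diag_indices(n)] = -np.asarray(in_cluster_frequency, dtype=float)
    Q += lam * np.triu(np.asarray(cross_cluster_cooccurrence, dtype=float), k=1)
    return Q

def qubo_energy(Q, z):
    """Objective value z^T Q z for a candidate binary selection z (e.g. from an
    annealer or exhaustive search over small tag sets)."""
    z = np.asarray(z, dtype=float)
    return float(z @ Q @ z)
```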

Rule Extraction via Frequent-Itemset Mining: Explanations can be formulated as conjunctions of predicates (rules) that maximize the coverage of a cluster while minimizing leakage into others. The problem maps to generalized frequent-itemset mining (gFIM) with support and conciseness thresholds, leveraging attribute selection and interval taxonomies for scalability. The Cluster-Explorer tool achieves superior rule quality and efficiency compared to decision-tree or post-hoc XAI baselines (Ofek et al., 2024).
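
A condensed sketch of this rule-mining recipe using the mlxtend apriori implementation (as a stand-in for the generalized FIM step in Cluster-Explorer); it assumes the attributes have already been discretized and one-hot encoded into boolean predicate columns.

```python
from mlxtend.frequent_patterns import apriori

def cluster_rules(predicates, labels, cluster, min_support=0.5, max_leakage=0.1):
    """predicates: boolean DataFrame whose columns are attribute-value (or
    interval) predicates; labels: cluster assignment per row. Mines itemsets
    that are frequent inside the target cluster (coverage) and keeps those that
    rarely hold outside it (low leakage)."""
    inside = predicates[labels == cluster]
    outside = predicates[labels != cluster]
    frequent = apriori(inside, min_support=min_support, use_colnames=True)
    rules = []
    for _, row in frequent.iterrows():
        conj = list(row["itemsets"])                      # conjunction of predicates
        leakage = outside[conj].all(axis=1).mean() if len(outside) else 0.0
        if leakage <= max_leakage:
            rules.append({"rule": conj, "coverage": row["support"], "leakage": leakage})
    return sorted(rules, key=lambda r: (-r["coverage"], r["leakage"]))
```

In practice the boolean predicate columns can be produced, for example, with pandas.get_dummies over binned numeric features.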

Pattern Mining and Declarative Constraints: Explainability-driven clustering can also be formulated as a combinatorial selection of clusters and their associated pattern-based explanations under declarative (e.g., constraint programming) frameworks. Coverage, discrimination, and prior knowledge constraints can be encoded at cluster or pattern levels, and optimized for multiple objectives (coverage, conciseness, discrimination, clustering quality) (Guilbert et al., 2024).

5. Attribution and Prototype-Based Explainability Methods

Permutation- and Perturbation-Based Feature Importance: G2PC (Global Permutation Percent Change) provides global feature importance by measuring the percent of cluster-label flips when feature (or domain) values are permuted; L2PC (Local Perturbation Percent Change) gives per-instance, per-feature importance by locally resampling feature values. Both are algorithm-agnostic and applicable across different clustering approaches or data modalities (Ellis et al., 2021).
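
A minimal algorithm-agnostic sketch of the global variant, assuming `assign_fn` is any function that maps a data matrix to cluster labels (for example, the `predict` method of a fitted scikit-learn `KMeans`); the local variant follows the same pattern but resamples a single instance's feature values.

```python
import numpy as np

def g2pc(assign_fn, X, n_repeats=20, seed=0):
    """Global permutation percent change: for each feature, permute its column,
    re-assign all points, and record the percentage whose cluster label flips;
    averaging over repeats gives a global importance score per feature."""
    rng = np.random.default_rng(seed)
    base = assign_fn(X)
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        flip_rates = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            flip_rates.append(np.mean(assign_fn(Xp) != base))
        importance[j] = 100.0 * np.mean(flip_rates)
    return importance
```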

Exemplar-Based Explanations: Clusters can be explained by a small subset of representative points (exemplars), selected so that each cluster member is close to at least one exemplar. Although finding the smallest such set is NP-hard, practical greedy algorithms offer approximation guarantees (e.g., within a factor $O(\log n)$ of the minimum). This supports interpretability in domains with deep or non-interpretable embeddings, text, or images, providing concrete, human-understandable cluster representatives (Davidson et al., 2022).
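
A short sketch of the greedy cover heuristic, assuming a precomputed pairwise-distance matrix and a coverage radius; the greedy rule of always adding the candidate that covers the most uncovered members is what yields the classical $O(\log n)$ set-cover guarantee.

```python
import numpy as np

def greedy_exemplars(D, labels, radius):
    """D: (n, n) pairwise distances; labels: cluster per point; radius >= 0.
    Select exemplars so every point is within `radius` of an exemplar from its
    own cluster. Since each point covers at least itself, the loop terminates."""
    exemplars = []
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        uncovered = set(members)
        while uncovered:
            # candidate covering the largest number of still-uncovered members
            best = max(members,
                       key=lambda e: sum(D[e, m] <= radius for m in uncovered))
            exemplars.append(int(best))
            uncovered -= {m for m in uncovered if D[best, m] <= radius}
    return exemplars
```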

Shapley-Based Attributions in Clustering: SHAP (Shapley Additive Explanations) values, computed in a supervised or semi-supervised setting, are used to obtain per-feature attributions for each instance. Clustering in the induced Shapley-attribution space (e.g., using UMAP and HDBSCAN) can recover semantically meaningful subgroups, particularly in diagnosis and prognosis applications, with subsequent rule extraction in raw feature space. Evaluation metrics include cluster quality, rule coverage and precision, and relevance for fault diagnosis (Cohen et al., 2023).
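
A brief sketch of the attribution-space clustering step, assuming the per-instance SHAP values have already been computed as an (n_instances, n_features) matrix (for example from a tree-ensemble surrogate); the UMAP and HDBSCAN hyperparameters shown are placeholders.

```python
import numpy as np
import umap
import hdbscan

def subgroups_in_attribution_space(shap_matrix, n_neighbors=15, min_cluster_size=20):
    """Embed instances by their Shapley attribution vectors with UMAP, then find
    density-based subgroups with HDBSCAN (label -1 marks noise points). The
    resulting subgroups can afterwards be described by rules in raw feature space."""
    embedding = umap.UMAP(n_components=2, n_neighbors=n_neighbors,
                          random_state=0).fit_transform(np.asarray(shap_matrix))
    labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(embedding)
    return embedding, labels
```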

6. Counterfactual and Model-Agnostic Approaches

Counterfactual Explanations for Clustering: Counterfactual methods generate hypothetical points that would move an instance from its assigned cluster to a specified target cluster, maximizing similarity (Gower distance, feature overlap) and a "soft-score" that quantifies proximity or membership probability with respect to the target cluster. The methodology is model-agnostic and accommodates both centroid-based and density-based clustering, significantly increasing the percentage of instances for which actionable counterfactuals can be found without compromising input sparsity or computation time (Spagnol et al., 2024).
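
The sketch below is a deliberately simple, model-agnostic greedy search in this spirit (not the algorithm of Spagnol et al.): it assumes `assign_fn` maps a batch of points to cluster labels and that a prototype of the target cluster (for example its centroid or medoid) is available, and it changes as few features as possible.

```python
import numpy as np

def greedy_counterfactual(x, target_prototype, assign_fn, target_cluster):
    """Starting from instance x, repeatedly overwrite the single remaining
    feature with the largest gap to the target prototype, until the point is
    assigned to the target cluster. Changing features one at a time keeps the
    counterfactual sparse; returns (counterfactual, changed_features), or
    (None, changed_features) if even the full prototype does not flip the label."""
    x_cf = np.array(x, dtype=float)
    gaps = np.abs(x_cf - np.asarray(target_prototype, dtype=float))
    changed = []
    for j in np.argsort(-gaps):                           # largest gaps first
        if assign_fn(x_cf[None, :])[0] == target_cluster:
            return x_cf, changed
        x_cf[j] = target_prototype[j]
        changed.append(int(j))
    if assign_fn(x_cf[None, :])[0] == target_cluster:
        return x_cf, changed
    return None, changed
```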

Supervised Post-Hoc Attributions for Clusters: Training supervised models (e.g., neural networks, logistic regression) to predict cluster assignments allows the application of classical feature-importance tests (e.g., Single Feature Introduction Test, SFIT), providing statistical confidence and median-based importance measures for features characterizing each cluster. This two-step approach is flexible and robust for regulatory or compliance contexts (Horel et al., 2019).
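
A compact sketch of the two-step recipe with scikit-learn, using a logistic-regression surrogate and permutation importance as a simple stand-in for SFIT-style single-feature tests; the statistical confidence intervals and median-based scores of the cited approach are not reproduced here.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance

def post_hoc_cluster_importance(X, cluster_labels, n_repeats=30):
    """Step 1: fit a supervised surrogate that predicts the cluster assignments.
    Step 2: score each feature by how much shuffling it degrades the surrogate's
    accuracy at recovering those assignments."""
    surrogate = LogisticRegression(max_iter=1000).fit(X, cluster_labels)
    result = permutation_importance(surrogate, X, cluster_labels,
                                    n_repeats=n_repeats, random_state=0)
    return result.importances_mean, result.importances_std
```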


In summary, clustering-based explainability techniques comprise a multifaceted toolkit, ranging from decision-tree-induced clusterings with formal approximation guarantees (Gamlath et al., 2021, Gupta et al., 2023), to tag-based pattern descriptors (Liu et al., 2022), to deep and black-box model attribution (Zhang et al., 2021, Kauffmann et al., 2019), counterfactual (Spagnol et al., 2024), rule-mining (Ofek et al., 2024), and prototype/exemplar strategies (Davidson et al., 2022). The theoretical results tightly characterize the loss incurred for axis-aligned explainable clusterings, while algorithmic innovations enable practical interpretable clustering at scale and in complex domains. A persistent theme is the trade-off between interpretability (often formalized via axis-aligned trees, tags, or patterns) and clustering performance, with worst-case gaps that are near-optimal up to logarithmic factors in $k$ for $k$-medians and linear in $k$ for $k$-means.
