
Core Stability in Non-Centroid Clustering

Updated 1 December 2025
  • The paper introduces a formal framework defining the α-core using max-loss objectives to quantify cluster robustness in non-centroid settings.
  • It demonstrates impossibility results, showing that no k-clustering can achieve the 1-core under certain conditions, and provides tight α-bound analyses.
  • The study proposes algorithmic relaxations such as Fully Justified Representation and spectral stability methods to enhance practical robustness in clustering.

Core stability in non-centroid clustering refers to the robustness of cluster assignments under potential group deviations, quantified by whether a coalition of agents can jointly improve their losses by switching to a different clustering configuration. In non-centroid clustering, loss is not determined by distance to a representative point (centroid), but by some function of pairwise distances or graph interactions among cluster members. Formal definitions, impossibility results, algorithmic frameworks, and empirical findings demonstrate the complexity and limitations of achieving core stability, especially under the max-loss objective, where an agent's loss is its largest distance to any member of its own cluster.

1. Formal Framework of Core Stability in Non-Centroid Clustering

A finite metric space $(N, d)$ consists of $n$ agents with a symmetric distance function $d$. A non-centroid $k$-clustering defines a partition $\mathcal C = \{C_1, \dots, C_k\}$, where $C_j \neq \emptyset$ and $\bigcup_j C_j = N$. The max-loss objective assigns to agent $i \in N$ in cluster $S \subseteq N$ the loss

$\loss_i(S) = \max_{j \in S} d(i, j).$

For clustering $\mathcal C$, we write $\loss_i(\mathcal C) = \loss_i(\mathcal C(i))$, where $\mathcal C(i)$ is the cluster containing $i$.

$\alpha$-blocking coalition: A subset $S \subseteq N$ of size $|S| \geq n/k$ $\alpha$-blocks $\mathcal C$ if, for every $i \in S$,

$\alpha \cdot \loss_i(S) < \loss_i(\mathcal C).$

$\alpha$-core: $\mathcal C$ is in the $\alpha$-core if there is no $\alpha$-blocking coalition of size $\geq n/k$. The $1$-core is referred to as the core (Caragiannis et al., 30 Oct 2024, Bredereck et al., 24 Nov 2025).
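The definitions above can be checked directly on small instances. The following is a minimal brute-force sketch, not an algorithm from the cited papers: the function names, the list-of-lists distance matrix `d`, and the toy instance are all illustrative assumptions, and the coalition enumeration is exponential in $n$.

```python
from itertools import combinations

def max_loss(i, S, d):
    """Max-loss of agent i in group S: the largest distance from i to a member of S."""
    return max(d[i][j] for j in S)

def alpha_blocks(S, clustering, d, alpha):
    """True if alpha * loss_i(S) < loss_i(C(i)) holds for every agent i in S."""
    cluster_of = {i: C for C in clustering for i in C}
    return all(alpha * max_loss(i, S, d) < max_loss(i, cluster_of[i], d) for i in S)

def in_alpha_core(clustering, d, alpha, k):
    """Brute force over all coalitions of size >= n/k (exponential in n)."""
    n = len(d)
    min_size = -(-n // k)  # ceil(n / k)
    return not any(
        alpha_blocks(S, clustering, d, alpha)
        for size in range(min_size, n + 1)
        for S in combinations(range(n), size)
    )

# Hypothetical toy instance: six agents on a line, k = 2.
points = [0, 1, 2, 10, 11, 12]
d = [[abs(p - q) for q in points] for p in points]
natural = [(0, 1, 2), (3, 4, 5)]
mixed = [(0, 1, 3), (2, 4, 5)]
print(in_alpha_core(natural, d, alpha=1, k=2))  # True
print(in_alpha_core(mixed, d, alpha=2, k=2))    # False
```

In the mixed clustering, the coalition $\{0, 1, 2\}$ lowers each member's loss from 9 or 10 to at most 2, so it blocks even at $\alpha = 2$.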

2. Impossibility Theorems and Core Emptiness

For $k \geq 3$ and $n \geq 9$ divisible by $k$, there exist metric instances such that no $k$-clustering lies in the $\alpha$-core for any $\alpha < 2^{1/5} \approx 1.1487$. The construction, detailed in (Bredereck et al., 24 Nov 2025), uses specially structured coalitions $S_1, \dots, S_5$ achieving distinct internal max-losses:

  • $\loss_i(S_1) = 2^{1/5}$
  • $\loss_i(S_2) = 2^{2/5}$
  • $\loss_i(S_3) \in \{1,2\}$
  • $\loss_i(S_4) = 2^{4/5}$
  • $\loss_i(S_5) = 2^{3/5}$

Any clustering forces at least one such coalition to improve by a factor of at least $2^{1/5}$, so the $\alpha$-core is empty for every $\alpha < 2^{1/5}$ and, in particular, the core is empty. The bound is tight: at $\alpha = 2^{1/5}$, cluster assignments can precisely meet all coalition thresholds.

A computer-assisted construction in 2D Euclidean space ($k = 3$, $n = 9$) yields an instance in which no clustering lies in the $\alpha$-core for $\alpha < 1.138$. This demonstrates that the core ($1$-core) can be empty for non-centroid, max-loss clustering, a question that was previously unresolved (Bredereck et al., 24 Nov 2025).
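Emptiness of the core on a given small instance can be tested by exhaustive search. The sketch below is an illustrative assumption, not the computer-assisted construction from the cited paper: it computes, for each $k$-clustering, the smallest $\alpha$ for which that clustering is in the $\alpha$-core, and then minimizes over all clusterings. It assumes distinct points and $n/k > 1$, so coalition losses are positive; the helper names are hypothetical.

```python
from itertools import combinations, product

def max_loss(i, S, d):
    return max(d[i][j] for j in S)

def core_violation(clustering, d, k):
    """Smallest alpha for which the clustering is in the alpha-core:
    max over coalitions S of min over i in S of loss_i(C) / loss_i(S)."""
    n = len(d)
    cluster_of = {i: C for C in clustering for i in C}
    current = [max_loss(i, cluster_of[i], d) for i in range(n)]
    worst = 0.0
    for size in range(-(-n // k), n + 1):  # sizes from ceil(n / k) to n
        for S in combinations(range(n), size):
            worst = max(worst, min(current[i] / max_loss(i, S, d) for i in S))
    return worst

def all_k_clusterings(n, k):
    """Yield every partition of range(n) into exactly k non-empty clusters."""
    for labels in product(range(k), repeat=n):
        # Keep one labeling per partition: labels must first appear in order 0, 1, 2, ...
        highest = -1
        for label in labels:
            if label > highest + 1:
                break
            highest = max(highest, label)
        else:
            if highest == k - 1:
                yield [tuple(i for i in range(n) if labels[i] == c) for c in range(k)]

def best_core_factor(d, k):
    """Minimum core violation over all k-clusterings; a value above 1 means the core is empty."""
    return min(core_violation(C, d, k) for C in all_k_clusterings(len(d), k))

# Hypothetical toy instance: six agents on a line, k = 2.
points = [0, 1, 2, 10, 11, 12]
d = [[abs(p - q) for q in points] for p in points]
print(best_core_factor(d, 2))  # 1.0, so the core is non-empty here
```

The value is never below 1, because some cluster has at least $n/k$ members and, taken as a coalition, leaves its members' losses unchanged.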

3. Algorithmic Relaxations: FJR and Approximate Cores

Given the restrictive nature and possible emptiness of the core, Fully Justified Representation (FJR) provides a relaxation. A clustering $C$ satisfies $\alpha$-FJR if no coalition $S$ of size $\geq n/k$ can reduce everyone's loss below $\min_{j \in S} \loss_j(C(j))$ by a factor more than $\alpha$, that is, if no such $S$ satisfies

$\forall i \in S,\, \alpha \cdot \loss_i(S) < \min_{j \in S} \loss_j(C(j))$
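For the max-loss objective, the condition above compares the smallest current loss among the coalition's members with the coalition's diameter. A minimal brute-force sketch under that reading follows; the function names and the toy instance are illustrative assumptions, and distinct points with $n/k > 1$ are assumed so that no diameter is zero.

```python
from itertools import combinations

def max_loss(i, S, d):
    return max(d[i][j] for j in S)

def fjr_violation(clustering, d, k):
    """Smallest alpha for which the clustering satisfies alpha-FJR under max-loss:
    max over coalitions S of (min_j loss_j(C(j))) / (max_i loss_i(S))."""
    n = len(d)
    cluster_of = {i: C for C in clustering for i in C}
    current = [max_loss(i, cluster_of[i], d) for i in range(n)]
    worst = 0.0
    for size in range(-(-n // k), n + 1):  # sizes from ceil(n / k) to n
        for S in combinations(range(n), size):
            threshold = min(current[j] for j in S)          # best-off member's current loss
            diameter = max(max_loss(i, S, d) for i in S)    # worst loss inside S
            worst = max(worst, threshold / diameter)
    return worst

# Hypothetical toy instance: six agents on a line, k = 2.
points = [0, 1, 2, 10, 11, 12]
d = [[abs(p - q) for q in points] for p in points]
print(fjr_violation([(0, 1, 2), (3, 4, 5)], d, 2))  # 0.5
print(fjr_violation([(0, 1, 3), (2, 4, 5)], d, 2))  # 4.5
```

Because every member is compared against the same threshold, this quantity never exceeds the corresponding core violation, which is the sense in which FJR relaxes the core.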

Algorithmically, the GreedyCohesiveClustering framework, which uses an exact or approximate oracle for cohesive cluster selection, constructs clusterings that satisfy FJR exactly or up to a constant factor. The efficient GreedyCapture algorithm has the following properties:

  • It achieves the $2$-core (max-loss) and $4$-FJR (average loss), running in $O(kn)$ time.
  • Core and FJR violation metrics computed in practice indicate that GreedyCapture is consistently fairer than $k$-means++ or $k$-medoids, at the cost of only moderate increases in standard clustering cost (Caragiannis et al., 30 Oct 2024).

| Algorithm | Objective | Core/FJR Guarantee | Runtime |
|---|---|---|---|
| GreedyCohesiveClustering | Arbitrary loss | Exact FJR (factor-$\lambda$) | Inefficient |
| GreedyCapture | Average/max-loss | $O(n/k)$-core (average), $2$-core (max-loss) | $O(kn)$ |
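To give a feel for the ball-growing idea behind GreedyCapture, here is a simplified sketch. It is an assumption-laden illustration rather than the authors' implementation: it repeatedly picks the tightest ball that captures $\lceil n/k \rceil$ unassigned points, breaks ties by index, and places any leftover points in one final cluster. This naive version recomputes nearest neighbors each round, so it does not match the runtime reported above, and it may return fewer than $k$ clusters.

```python
import math

def greedy_capture(d, k):
    """Repeatedly open the smallest ball that captures ceil(n/k) unassigned points."""
    n = len(d)
    size = math.ceil(n / k)
    unassigned = set(range(n))
    clusters = []
    while len(unassigned) >= size:
        best = None  # (radius, members) of the tightest ball found this round
        for center in sorted(unassigned):
            # The `size` unassigned points closest to this center (ties broken by index).
            nearest = sorted(unassigned, key=lambda j: (d[center][j], j))[:size]
            radius = d[center][nearest[-1]]
            if best is None or radius < best[0]:
                best = (radius, nearest)
        clusters.append(tuple(sorted(best[1])))
        unassigned -= set(best[1])
    if unassigned:
        # Assumption: leftover points (fewer than ceil(n/k)) form one last cluster.
        clusters.append(tuple(sorted(unassigned)))
    return clusters

# Hypothetical toy instance: six agents on a line, k = 2.
points = [0, 1, 2, 10, 11, 12]
d = [[abs(p - q) for q in points] for p in points]
print(greedy_capture(d, 2))  # [(0, 1, 2), (3, 4, 5)]
```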

4. Statistical and Probabilistic Core Notions

Beyond worst-case coalition deviations, statistical core stability quantifies the robustness of cluster membership under stochasticity. For sample-dependent clustering methods (hierarchical, density-based, spectral), core clusters are the largest subsets where every pair $(i, j)$ co-occurs in the same cluster with probability $\geq 1 - \alpha$:

$P_{ij} = \Pr_{D' \sim F}[f_{D'}(i) = f_{D'}(j)] \geq 1 - \alpha$

Estimating $P_{ij}$ via bootstrapping and finding core clusters reduces to max-clique identification in a co-occurrence graph. Empirical results indicate non-centroid methods (e.g., hierarchical linkage) tend to have smaller and less pure cores than centroid-based methods, emphasizing the instability of assignments near cluster boundaries (Henelius et al., 2016).
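The estimation and clique step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the cited work: `label_runs` is a hypothetical input holding one label vector per bootstrap replicate (the resampling and the clustering method itself are not shown), only the single largest core is returned, and the clique search is exponential in the worst case.

```python
def cooccurrence(label_runs):
    """Estimate P_ij as the fraction of replicates in which i and j share a label."""
    n = len(label_runs[0])
    runs = len(label_runs)
    return [[sum(r[i] == r[j] for r in label_runs) / runs for j in range(n)]
            for i in range(n)]

def largest_core(P, alpha):
    """Largest set whose members pairwise satisfy P_ij >= 1 - alpha,
    found as a maximum clique of the thresholded co-occurrence graph."""
    n = len(P)
    adj = {i: {j for j in range(n) if j != i and P[i][j] >= 1 - alpha}
           for i in range(n)}
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique
        for v in sorted(candidates):
            if len(clique) + len(candidates) <= len(best):
                return  # cannot beat the best clique found so far
            expand(clique + [v], candidates & adj[v])
            candidates = candidates - {v}

    expand([], set(range(n)))
    return sorted(best)

# Hypothetical labels for five points over four bootstrap replicates.
label_runs = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0],
]
P = cooccurrence(label_runs)
print(largest_core(P, alpha=0.3))  # [0, 1, 2]
print(largest_core(P, alpha=0.1))  # [0, 1]
```

Point 2 switches sides in one replicate, so it belongs to the core only when $\alpha$ is loose enough to tolerate a co-occurrence rate of 0.75.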

5. Stability in Graph-Based Non-Centroid Clustering

Maximum-likelihood mixture models for graphs (e.g., the NL-EM model) enable a node-centric notion of core stability. Stabilizer nodes are those whose connection patterns strictly exclude membership in all but one group for their neighbors, making the classification "crisp." Extraction involves solving set-cover instances among neighbor excluded-connection sets. The abundance and redundancy of stabilizers correspond to the resilience of the classification under structural perturbations and noise. Real-world examples (U.S. Senate co-voting, food webs) identify stabilizers as information-rich backbones reflective of core community structure (0809.1398).

6. Spectral Stability: Core Assessment via Eigenvalue Gaps

For spectral clustering, stability is assessed via the $k$-th spectral gap $g_k = \lambda_{k+1} - \lambda_k$ and the structured distance to ambiguity $\delta_k(W)$. The latter is the minimum Laplacian perturbation under admissible changes such that the $k$-th gap collapses. A two-level iterative algorithm, combining constrained gradient flow (inner) and root-finding (outer), computes $\delta_k(W)$ robustly in sparse graphs. Structured stability indicators can yield different optimal cluster numbers compared to unstructured spectral gaps, especially for real data with ambiguous community separation (Andreotti et al., 2019).

| Stability Indicator | Definition | Use Case |
|---|---|---|
| Spectral gap $g_k$ | $\lambda_{k+1} - \lambda_k$ | Rapid practical screening |
| Structured distance $\delta_k$ | Min. Frobenius norm $\lVert L(W) - L(\widehat W) \rVert_F$ for vanishing gap | Robustness assessment |
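As a small worked example of the spectral gap, consider a graph made of two disjoint unit-weight triangles. The choice of the unnormalized Laplacian $L = D - W$ with eigenvalues sorted in ascending order is an assumption made here for illustration; the source does not fix the normalization.

```latex
% Assumption: L = D - W is the unnormalized Laplacian, eigenvalues sorted ascending.
% Graph: two disjoint unit-weight triangles (K_3 \sqcup K_3), n = 6.
% Each triangle contributes Laplacian eigenvalues 0, 3, 3.
\begin{align*}
\operatorname{spec}(L) &= \{0,\ 0,\ 3,\ 3,\ 3,\ 3\}, \\
g_1 &= \lambda_2 - \lambda_1 = 0 - 0 = 0, \\
g_2 &= \lambda_3 - \lambda_2 = 3 - 0 = 3, \\
g_3 &= \lambda_4 - \lambda_3 = 3 - 3 = 0.
\end{align*}
```

The gap is largest at $k = 2$, matching the two components. Adding weak edges between the triangles would shrink $g_2$, which is the kind of ambiguity the structured distance $\delta_k(W)$ is meant to measure.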

7. Research Directions and Open Problems

  • The exact core ($\alpha = 1$) can be empty under max-loss; $2$-core existence is currently the best general guarantee. Structured metric properties or alternative losses (sum-loss, $\ell_2$-loss) might admit stronger results but remain open.
  • Efficient auditing procedures enable estimation of FJR violation; analogous constant-factor core audits remain an open technical challenge.
  • Statistical core notions allow fine-grained stability assessments, but scale is limited by max-clique complexity and bootstrap instance size.
  • Extensions to richer and non-pairwise loss functions, adaptive choice of $k$, and coalition-formation models may yield further insights into tractable core stability for non-centroid clustering (Bredereck et al., 24 Nov 2025, Caragiannis et al., 30 Oct 2024, Henelius et al., 2016, 0809.1398, Andreotti et al., 2019).

A plausible implication is that the conceptual shift from exact core stability to relaxations (FJR, statistical cores, stabilizers, structured spectral stability) provides a necessary framework for achieving practically robust, interpretable, and fair clusterings in complex metric and graph-based domains.
