Capacitated Fair-Range Clustering: Hardness and Approximation Algorithms (2505.15905v1)
Abstract: Capacitated fair-range $k$-clustering generalizes classical $k$-clustering by incorporating both capacity constraints and demographic fairness. In this setting, each facility has a capacity limit and may belong to one or more demographic groups. The task is to select $k$ facilities as centers and assign each client to a center such that: ($a$) no center exceeds its capacity, ($b$) the number of centers selected from each group lies within specified lower and upper bounds (fair-range constraints), and ($c$) the clustering cost (e.g., $k$-median or $k$-means) is minimized. Prior work by Thejaswi et al. (KDD 2022) showed that satisfying fair-range constraints is NP-hard, making the problem inapproximable to any polynomial factor. We strengthen this result by showing that inapproximability persists even when the fair-range constraints are trivially satisfiable, highlighting the intrinsic computational complexity of the clustering task itself. Assuming standard complexity conjectures, we show that no non-trivial approximation is possible without exhaustively enumerating all $k$-subsets of the facility set. Notably, our inapproximability results hold even on tree metrics and when the number of groups is logarithmic in the size of the facility set. In light of these strong inapproximability results, we focus on a more practical setting where the number of groups is constant. In this regime, we design two approximation algorithms: ($i$) a polynomial-time algorithm achieving $O(\log k)$- and $O(\log^2 k)$-approximation for the $k$-median and $k$-means objectives, respectively, and ($ii$) a fixed-parameter tractable algorithm parameterized by $k$, achieving $(3+\epsilon)$- and $(9 + \epsilon)$-approximation, respectively. These results match the best-known approximation guarantees for capacitated clustering without fair-range constraints and resolve an open question posed by Zang et al. (NeurIPS 2024).
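To make the problem statement concrete, the following is a minimal brute-force sketch (not one of the paper's algorithms) of capacitated fair-range $k$-median on a line metric. It enumerates all $k$-subsets of facilities, filters those violating the fair-range bounds, and checks capacity-feasible client assignments exhaustively; the helper names and the toy instance are illustrative assumptions, and the runtime is exponential by design, echoing the abstract's point that enumerating $k$-subsets is unavoidable in general.

```python
from itertools import combinations, product

def solve_capacitated_fair_range(clients, facilities, capacities,
                                 groups, bounds, k):
    """Exact brute-force solver for tiny instances (illustrative only).

    clients    : list of client positions (floats on a line metric)
    facilities : dict facility_id -> position
    capacities : dict facility_id -> capacity limit
    groups     : dict group_name -> set of facility_ids in that group
    bounds     : dict group_name -> (lower, upper) bound on chosen centers
    k          : number of centers to open
    """
    best_cost, best_centers = float("inf"), None
    for centers in combinations(facilities, k):
        # Fair-range check: count of chosen centers per group within bounds.
        if not all(bounds[g][0] <= len(set(centers) & groups[g]) <= bounds[g][1]
                   for g in groups):
            continue
        # Enumerate all client-to-center assignments; keep capacity-feasible ones.
        for assign in product(centers, repeat=len(clients)):
            if any(assign.count(f) > capacities[f] for f in centers):
                continue
            # k-median cost: sum of client-to-assigned-center distances.
            cost = sum(abs(c - facilities[f]) for c, f in zip(clients, assign))
            if cost < best_cost:
                best_cost, best_centers = cost, set(centers)
    return best_cost, best_centers

# Toy instance: three facilities on a line, two demographic groups.
facilities = {0: 0.0, 1: 5.0, 2: 10.0}
capacities = {0: 2, 1: 2, 2: 2}
groups = {"A": {0, 1}, "B": {2}}
bounds = {"A": (1, 2), "B": (0, 1)}
clients = [0.0, 1.0, 9.0, 10.0]

cost, centers = solve_capacitated_fair_range(
    clients, facilities, capacities, groups, bounds, k=2)
# Opening facilities 0 and 2 satisfies both group bounds and
# serves all four clients within capacity at total distance 2.0.
```

The hardness results in the paper say that, under standard conjectures, no polynomial-time algorithm can beat this kind of exhaustive enumeration in the general (non-constant-group) regime.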