Papers
Topics
Authors
Recent
Search
2000 character limit reached

Parameterized Complexity of Categorical Clustering with Size Constraints

Published 16 Apr 2021 in cs.DS and cs.DM | (2104.07974v1)

Abstract: In the Categorical Clustering problem, we are given a set of vectors (matrix) A={a_1,\ldots,a_n} over \Sigmam, where \Sigma is a finite alphabet, and integers k and B. The task is to partition A into k clusters such that the median objective of the clustering in the Hamming norm is at most B. That is, we seek a partition {I_1,\ldots,I_k} of {1,\ldots,n} and vectors c_1,\ldots,c_k\in\Sigmam such that \sum_{i=1}k\sum_{j\in I_i}d_h(c_i,a_j)\leq B, where d_H(a,b) is the Hamming distance between vectors a and b. Fomin, Golovach, and Panolan [ICALP 2018] proved that the problem is fixed-parameter tractable (for binary case \Sigma={0,1}) by giving an algorithm that solves the problem in time 2{O(B\log B)} (mn){O(1)}. We extend this algorithmic result to a popular capacitated clustering model, where in addition the sizes of the clusters should satisfy certain constraints. More precisely, in Capacitated Clustering, in addition, we are given two non-negative integers p and q, and seek a clustering with p\leq |I_i|\leq q for all i\in{1,\ldots,k}. Our main theorem is that Capacitated Clustering is solvable in time 2{O(B\log B)}|\Sigma|B(mn){O(1)}. The theorem not only extends the previous algorithmic results to a significantly more general model, it also implies algorithms for several other variants of Categorical Clustering with constraints on cluster sizes.

Citations (2)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.