Approximation Algorithms for Fair Range Clustering (2306.06778v2)

Published 11 Jun 2023 in cs.LG, cs.AI, and cs.DS

Abstract: This paper studies the fair range clustering problem in which the data points are from different demographic groups and the goal is to pick $k$ centers with the minimum clustering cost such that each group is at least minimally represented in the centers set and no group dominates the centers set. More precisely, given a set of $n$ points in a metric space $(P,d)$ where each point belongs to one of the $\ell$ different demographics (i.e., $P = P_1 \uplus P_2 \uplus \cdots \uplus P_\ell$) and a set of $\ell$ intervals $[\alpha_1, \beta_1], \cdots, [\alpha_\ell, \beta_\ell]$ on desired number of centers from each group, the goal is to pick a set of $k$ centers $C$ with minimum $\ell_p$-clustering cost (i.e., $(\sum_{v\in P} d(v,C)^p)^{1/p}$) such that for each group $i\in \ell$, $|C\cap P_i| \in [\alpha_i, \beta_i]$. In particular, the fair range $\ell_p$-clustering captures fair range $k$-center, $k$-median and $k$-means as its special cases. In this work, we provide efficient constant factor approximation algorithms for fair range $\ell_p$-clustering for all values of $p\in [1,\infty)$.
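To make the problem statement concrete, the following minimal Python sketch evaluates the two quantities the abstract defines for a candidate center set $C$: the $\ell_p$-clustering cost $(\sum_{v\in P} d(v,C)^p)^{1/p}$ and the range constraint $|C\cap P_i| \in [\alpha_i, \beta_i]$. It only checks a given solution and is not the paper's approximation algorithm. The function names, the toy points on a line, and the group labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

def lp_clustering_cost(points, centers, dist, p):
    """l_p cost: (sum over v of d(v, C)^p)^(1/p), where d(v, C) is the
    distance from v to its nearest center in C."""
    total = sum(min(dist(v, c) for c in centers) ** p for v in points)
    return total ** (1.0 / p)

def satisfies_ranges(centers, group_of, ranges):
    """True if every group i has between alpha_i and beta_i centers,
    where ranges maps group -> (alpha_i, beta_i)."""
    counts = Counter(group_of[c] for c in centers)
    return all(lo <= counts[g] <= hi for g, (lo, hi) in ranges.items())

# Hypothetical toy instance: four points on a line, two groups "A" and "B".
points = [0, 1, 10, 11]
group_of = {0: "A", 1: "B", 10: "A", 11: "B"}
ranges = {"A": (1, 1), "B": (1, 1)}  # exactly one center per group, so k = 2

def dist(a, b):
    return abs(a - b)

unfair = [0, 10]  # both centers come from group "A"
fair = [0, 11]    # one center from each group

print(lp_clustering_cost(points, unfair, dist, 1), satisfies_ranges(unfair, group_of, ranges))
print(lp_clustering_cost(points, fair, dist, 1), satisfies_ranges(fair, group_of, ranges))
```

In this toy instance both center sets have the same $\ell_1$ ($k$-median) cost, but only the second respects the per-group intervals. Setting `p = 2` gives the square root of the $k$-means cost, and large `p` approaches the $k$-center objective.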
