A Unified Framework for Fair Spectral Clustering With Effective Graph Learning (2311.13766v1)
Abstract: We consider the problem of spectral clustering under group fairness constraints, where samples from each sensitive group are approximately proportionally represented in each cluster. Traditional fair spectral clustering (FSC) methods consist of two consecutive stages: performing fair spectral embedding on a given graph, and then running $k$-means to obtain discrete cluster labels. In practice, however, the graph is usually unknown and must be constructed from potentially noisy data, and the quality of the constructed graph inevitably affects subsequent fair clustering performance. Furthermore, performing FSC through separate steps breaks the connections among these steps, leading to suboptimal results. To this end, we first theoretically analyze the effect of the constructed graph on FSC. Motivated by the analysis, we propose a novel graph construction method with a node-adaptive graph filter to learn graphs from noisy data. Then, all independent stages of conventional FSC are integrated into a single objective function, forming an end-to-end framework that takes raw data as input and outputs discrete cluster labels. An algorithm is developed to jointly and alternately update the variables in each stage. Finally, we conduct extensive experiments on synthetic, benchmark, and real data, which show that our model is superior to state-of-the-art fair clustering methods.
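For orientation, the conventional two-stage FSC pipeline that the abstract contrasts against can be sketched as below. This is a minimal illustration, not the unified method proposed in the paper: it assumes the similarity graph `W` is already given, and it follows the unnormalized constrained formulation in the style of Kleindessner et al. (2019), where the embedding is restricted to the null space of a group-membership constraint matrix. The function names `fair_spectral_embedding` and `kmeans`, and the plain Lloyd iteration used for the second stage, are hypothetical choices made for this sketch.

```python
import numpy as np


def fair_spectral_embedding(W, groups, k):
    """Stage 1 (sketch): spectral embedding constrained to be group-balanced.

    W      : (n, n) symmetric nonnegative similarity matrix (assumed given).
    groups : (n,) array of sensitive-group labels, every group nonempty.
    k      : number of clusters / embedding dimensions.
    """
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    group_ids = np.unique(groups)
    # One column per group except the last (which would be redundant):
    # indicator of the group minus its overall proportion.
    F = np.stack(
        [(groups == s).astype(float) - np.mean(groups == s) for s in group_ids[:-1]],
        axis=1,
    )
    # Orthonormal basis Z of the null space of F^T, read off a full SVD of F.
    # Assumes F has full column rank, which holds when all groups are nonempty.
    U, _, _ = np.linalg.svd(F, full_matrices=True)
    Z = U[:, F.shape[1]:]
    # Smallest-k eigenvectors of the Laplacian projected onto that subspace.
    M = Z.T @ L @ Z
    M = (M + M.T) / 2  # symmetrize against round-off
    _, vecs = np.linalg.eigh(M)  # eigenvalues returned in ascending order
    return Z @ vecs[:, :k]


def kmeans(X, k, n_iter=100, seed=0):
    """Stage 2 (sketch): plain Lloyd iterations on the rows of the embedding."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array(
            [X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
             for j in range(k)]
        )
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return dist.argmin(axis=1)
```

Calling `kmeans(fair_spectral_embedding(W, groups, k), k)` runs the two stages back to back. Because the stages are decoupled, any error in `W` propagates unchanged into the embedding and then into the labels, which is the limitation the abstract's end-to-end formulation is meant to address.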