
Towards Explainable Clustering: A Constrained Declarative based Approach (2403.18101v1)

Published 26 Mar 2024 in cs.AI and cs.LG

Abstract: The domain of explainable AI is of interest in all Machine Learning fields, and it is all the more important in clustering, an unsupervised task whose result must be validated by a domain expert. We aim at finding a clustering that has high quality in terms of classic clustering criteria and that is explainable, and we argue that these two dimensions must be considered when building the clustering. We consider that a good global explanation of a clustering should give the characteristics of each cluster, taking into account their ability to describe the cluster's objects (coverage) while distinguishing it from the other clusters (discrimination). Furthermore, we aim at leveraging expert knowledge, at different levels, on the structure of the expected clustering or on its explanations. In our framework an explanation of a cluster is a set of patterns, and we propose a novel interpretable constrained clustering method called ECS, for declarative clustering with Explainability-driven Cluster Selection, that integrates structural or domain expert knowledge expressed by means of constraints. It is based on the notions of coverage and discrimination, which are formalized at different levels (cluster / clustering), each allowing for exceptions through parameterized thresholds. Our method relies on four steps: generation of a set of partitions, computation of frequent patterns for each cluster, pruning of clusters that violate some constraints, and selection of clusters and associated patterns to build an interpretable clustering. This last step is combinatorial and we have developed a Constraint-Programming (CP) model to solve it. The method can integrate prior knowledge in the form of user constraints, either before or within the CP model.
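To make the coverage and discrimination notions concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formalization: the abstract does not give the exact definitions, so the fractions below, the function names (`coverage`, `discrimination`, `is_explanatory`), and the threshold parameters `min_cov` and `min_disc` are all hypothetical. Objects are modeled as sets of items and a pattern as an itemset.

```python
from typing import FrozenSet, Sequence

Pattern = FrozenSet[str]
Obj = FrozenSet[str]


def coverage(pattern: Pattern, cluster: Sequence[Obj]) -> float:
    """Assumed definition: fraction of the cluster's objects that contain the pattern."""
    if not cluster:
        return 0.0
    return sum(pattern <= obj for obj in cluster) / len(cluster)


def discrimination(pattern: Pattern, others: Sequence[Obj]) -> float:
    """Assumed definition: fraction of objects outside the cluster that do NOT contain the pattern."""
    if not others:
        return 1.0
    return sum(not (pattern <= obj) for obj in others) / len(others)


def is_explanatory(pattern: Pattern, cluster: Sequence[Obj], others: Sequence[Obj],
                   min_cov: float, min_disc: float) -> bool:
    """Hypothetical test: thresholds below 1.0 tolerate exceptions on either side."""
    return (coverage(pattern, cluster) >= min_cov
            and discrimination(pattern, others) >= min_disc)


# Toy data (invented for illustration only).
cluster_a = [frozenset({"red", "round"}),
             frozenset({"red", "round", "small"}),
             frozenset({"red", "square"})]
cluster_b = [frozenset({"blue", "round"}),
             frozenset({"blue", "square"}),
             frozenset({"red", "square"})]

red = frozenset({"red"})
print(coverage(red, cluster_a))        # every object of cluster_a contains "red"
print(discrimination(red, cluster_b))  # one object of cluster_b also contains "red"
print(is_explanatory(red, cluster_a, cluster_b, min_cov=0.9, min_disc=0.6))
```

In this toy setting, the pattern `{"red"}` describes all of `cluster_a` but also matches one object of `cluster_b`, so whether it is accepted depends on how many exceptions the discrimination threshold allows.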
