Incremental Affinity Propagation based on Cluster Consolidation and Stratification (2401.14439v1)
Abstract: Modern data mining applications need to perform incremental clustering over dynamic datasets while tracing temporal changes in the resulting clusters. In this paper, we propose A-Posteriori affinity Propagation (APP), an incremental extension of Affinity Propagation (AP) that relies on cluster consolidation and cluster stratification to achieve faithfulness and forgetfulness. APP enforces incremental clustering in which i) newly arriving objects are dynamically consolidated into previous clusters without re-executing clustering over the entire dataset, and ii) a faithful sequence of clustering results is produced and maintained over time, while obsolete clusters can be forgotten through decremental learning. Four popular labeled datasets are used to compare the performance of APP against the benchmark results obtained by conventional AP and by Incremental Affinity Propagation based on Nearest neighbor Assignment (IAPNA). Experimental results show that APP achieves comparable clustering performance while remaining scalable.
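The two properties named in the abstract, consolidating new objects into earlier clusters and forgetting obsolete clusters, can be illustrated with a toy sketch. This is not the APP algorithm: APP extends Affinity Propagation, whereas the sketch below substitutes a plain nearest-centroid rule with a distance threshold so that it stays short. The class `IncrementalClusters` and its parameters `threshold` and `max_age` are hypothetical names introduced only for this illustration.

```python
import math


class IncrementalClusters:
    """Toy incremental clustering with consolidation and forgetting.

    Assumption: a nearest-centroid rule stands in for Affinity Propagation.
    """

    def __init__(self, threshold, max_age):
        self.threshold = threshold  # max distance for joining an existing cluster
        self.max_age = max_age      # batches a cluster may stay idle before removal
        self.clusters = {}          # cluster id -> {"members": [...], "last_seen": step}
        self.step = 0
        self._next_id = 0

    @staticmethod
    def _centroid(points):
        dim = len(points[0])
        return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))

    def add_batch(self, batch):
        """Consolidate a batch of new points without re-clustering old ones."""
        self.step += 1
        labels = []
        for point in batch:
            # Find the closest existing cluster centroid, if any.
            best_id, best_dist = None, math.inf
            for cid, cluster in self.clusters.items():
                dist = math.dist(point, self._centroid(cluster["members"]))
                if dist < best_dist:
                    best_id, best_dist = cid, dist
            # Open a new cluster when nothing is close enough.
            if best_id is None or best_dist > self.threshold:
                best_id = self._next_id
                self._next_id += 1
                self.clusters[best_id] = {"members": [], "last_seen": self.step}
            self.clusters[best_id]["members"].append(tuple(point))
            self.clusters[best_id]["last_seen"] = self.step
            labels.append(best_id)
        self._forget()
        return labels

    def _forget(self):
        """Drop clusters that received no new point for more than max_age batches."""
        stale = [cid for cid, cluster in self.clusters.items()
                 if self.step - cluster["last_seen"] > self.max_age]
        for cid in stale:
            del self.clusters[cid]


model = IncrementalClusters(threshold=1.0, max_age=1)
first = model.add_batch([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)])  # two clusters appear
second = model.add_batch([(0.2, 0.1)])                          # joins cluster 0
third = model.add_batch([(0.0, 0.1)])                           # cluster 1 is now stale
```

After the third batch, the cluster around `(5.0, 5.0)` has been idle for two batches and is dropped, while earlier assignments to cluster 0 are never revisited. A real incremental AP method would choose exemplars by message passing rather than by a fixed threshold.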