GSL-LPA: Fast Label Propagation Algorithm (LPA) for Community Detection with no Internally-Disconnected Communities (2403.01261v3)
Abstract: Community detection is the problem of identifying tightly connected clusters of nodes within a network. Efficient parallel algorithms for this play a crucial role in various applications, especially as datasets expand to significant sizes. The Label Propagation Algorithm (LPA) is commonly employed for this purpose due to its ease of parallelization, rapid execution, and scalability; however, it may yield internally disconnected communities. This technical report introduces GSL-LPA, derived from our parallelization of LPA, namely GVE-LPA. Our experiments on a system with two 16-core Intel Xeon Gold 6226R processors show that GSL-LPA not only mitigates this issue but also surpasses FLPA, igraph LPA, and NetworKit LPA by 55x, 10,300x, and 5.8x, respectively, achieving a processing rate of 844M edges/s on a 3.8B edge graph. Additionally, GSL-LPA scales at a rate of 1.6x for every doubling of threads.
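To make the notion of an internally disconnected community concrete, the following is a minimal Python sketch that flags communities whose members are not mutually reachable through intra-community edges alone. It is only an illustration of the property, not the method used by GSL-LPA; the function name `disconnected_communities`, the edge-list input, and the `labels` dictionary are all assumptions made for this example.

```python
from collections import defaultdict, deque

def disconnected_communities(edges, labels):
    """Return the labels of communities that are internally disconnected.

    edges  : iterable of undirected (u, v) pairs (hypothetical input format)
    labels : dict mapping each node to its community label
    """
    # Keep only edges whose two endpoints carry the same label.
    adj = defaultdict(list)
    for u, v in edges:
        if labels[u] == labels[v]:
            adj[u].append(v)
            adj[v].append(u)

    # Group nodes by community label.
    members = defaultdict(list)
    for node, label in labels.items():
        members[label].append(node)

    bad = set()
    for label, nodes in members.items():
        # Breadth-first search from one member over intra-community edges.
        seen = {nodes[0]}
        queue = deque([nodes[0]])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        # If some member was never reached, the community is split.
        if len(seen) < len(nodes):
            bad.add(label)
    return bad

# Path graph 0-1-2-3-4 where node 2 separates the two halves of community "a".
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
labels = {0: "a", 1: "a", 2: "b", 3: "a", 4: "a"}
print(disconnected_communities(edges, labels))
```

In this toy input, community "a" is reported because nodes 0 and 1 can reach nodes 3 and 4 only by passing through node 2, which belongs to a different community.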