From Louvain to Leiden: guaranteeing well-connected communities (1810.08473v3)

Published 19 Oct 2018 in cs.SI and physics.soc-ph

Abstract: Community detection is often used to understand the structure of large and complex networks. One of the most popular algorithms for uncovering community structure is the so-called Louvain algorithm. We show that this algorithm has a major defect that largely went unnoticed until now: the Louvain algorithm may yield arbitrarily badly connected communities. In the worst case, communities may even be disconnected, especially when running the algorithm iteratively. In our experimental analysis, we observe that up to 25% of the communities are badly connected and up to 16% are disconnected. To address this problem, we introduce the Leiden algorithm. We prove that the Leiden algorithm yields communities that are guaranteed to be connected. In addition, we prove that, when the Leiden algorithm is applied iteratively, it converges to a partition in which all subsets of all communities are locally optimally assigned. Furthermore, by relying on a fast local move approach, the Leiden algorithm runs faster than the Louvain algorithm. We demonstrate the performance of the Leiden algorithm for several benchmark and real-world networks. We find that the Leiden algorithm is faster than the Louvain algorithm and uncovers better partitions, in addition to providing explicit guarantees.



Summary

The paper identifies a major flaw in the Louvain algorithm: in the reported experiments, up to 25% of communities are badly connected and up to 16% are disconnected. It presents the Leiden algorithm as a remedy.
The Leiden algorithm adds a refinement phase between the local moving of nodes and the aggregation of the network, which guarantees that communities are γ-connected.
Experiments on benchmark and large real-world networks show that Leiden runs up to 20 times faster than Louvain while finding partitions with higher modularity.

Analyzing the Louvain and Leiden Algorithms for Community Detection

This paper identifies a defect in the Louvain algorithm, a widely used method for community detection in complex networks, and introduces the Leiden algorithm as a remedy. The authors demonstrate the defect, prove guarantees for the new algorithm, and compare the two algorithms empirically.

Louvain Algorithm: Strengths and Weaknesses

Community detection is a pivotal task in network science, enabling the dissection of a network into clusters or communities. The Louvain algorithm, developed to maximize modularity, is a popular choice due to its simplicity and efficiency. However, the authors uncover a significant flaw in the Louvain algorithm: it can produce communities that are internally disconnected or weakly connected, a phenomenon that compromises the reliability of the detected community structure.
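As a point of reference for the quality function being maximized, here is a minimal sketch of modularity with a resolution parameter γ for an undirected, unweighted graph. The function name, the edge-list input, and the `community` mapping are illustrative assumptions and do not come from the paper's implementation.

```python
from collections import defaultdict

def modularity(edges, community, gamma=1.0):
    """Modularity Q = sum_c [ e_c / m - gamma * (K_c / (2m))^2 ].

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every node to its community label
    gamma:     resolution parameter (1.0 gives standard modularity)
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # e_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # K_c: sum of the degrees of nodes in c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    total = 0.0
    for c, k in degree_sum.items():
        total += internal[c] - gamma * k * k / (4 * m)
    return total / m

# Two triangles joined by a single edge, split into the two triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, two_triangles))  # 5/14, about 0.357
```

Note that nothing in this quantity refers to whether a community is internally connected, which is why maximizing it alone cannot rule out the defect described above.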

Although the Louvain algorithm often performs well in comparative analyses, both in speed and in modularity optimization, it does not guarantee that communities are connected. In their experiments, the authors observe that up to 25% of the detected communities are badly connected and up to 16% are disconnected.
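Whether a given partition suffers from disconnected communities can be checked directly. The sketch below is an illustrative helper (the name `disconnected_communities` and the input format are assumptions, not part of the paper). It runs a breadth-first search inside each community, following only edges whose endpoints share that community.

```python
from collections import defaultdict, deque

def disconnected_communities(edges, community):
    """Return the labels of communities that are not internally connected.

    edges:     list of (u, v) pairs of an undirected graph
    community: dict mapping every node to its community label
    """
    members = defaultdict(set)
    for node, label in community.items():
        members[label].add(node)

    # Adjacency restricted to edges that stay inside one community.
    inside = defaultdict(set)
    for u, v in edges:
        if community[u] == community[v]:
            inside[u].add(v)
            inside[v].add(u)

    bad = []
    for label, nodes in members.items():
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in inside[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        if seen != nodes:  # some member is unreachable from `start`
            bad.append(label)
    return bad

# Path 0 - 1 - 2 where the middle node sits in a different community:
# nodes 0 and 2 share a label but are only linked through node 1.
print(disconnected_communities([(0, 1), (1, 2)], {0: "a", 1: "b", 2: "a"}))  # ['a']
```

The toy example mirrors the kind of situation the authors describe, where a node that bridges two parts of a community ends up in another community and leaves the rest of its old community disconnected.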

Introduction of the Leiden Algorithm

To address these issues, the authors propose the Leiden algorithm, which builds on ideas from earlier work, such as the smart local move algorithm and the fast local move approach. The Leiden algorithm guarantees that communities are connected and comes with stronger theoretical guarantees than the Louvain algorithm.

Phases in the Leiden Algorithm

Local Moving of Nodes: Nodes are moved to different communities to optimize the quality function.
Refinement of the Partition: Each community found in the first phase is refined and may be split into several well-connected sub-communities.
Aggregation of the Network: Based on the refined partition, an aggregate network is constructed, which forms the basis for further iterations.
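The aggregation phase can be sketched as follows. This is a simplified illustration under stated assumptions and not the authors' implementation: the function name `aggregate` and the weighted edge-list format are hypothetical, and community labels are assumed to be mutually comparable (for example all strings or all integers). Each community becomes a single node, and edge weights are summed, with edges inside a community becoming a self-loop.

```python
from collections import defaultdict

def aggregate(weighted_edges, community):
    """Collapse each community into one node of an aggregate network.

    weighted_edges: list of (u, v, weight) for an undirected graph
    community:      dict mapping every node to its community label
    Returns a dict {(c1, c2): total_weight} with c1 <= c2; a key (c, c)
    is a self-loop carrying the weight of the edges inside community c.
    """
    totals = defaultdict(float)
    for u, v, w in weighted_edges:
        cu, cv = community[u], community[v]
        key = (cu, cv) if cu <= cv else (cv, cu)  # undirected: order the pair
        totals[key] += w
    return dict(totals)

# Two triangles joined by one edge, aggregated by triangle.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]
partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(aggregate(edges, partition))
# {('a', 'a'): 3.0, ('a', 'b'): 1.0, ('b', 'b'): 3.0}
```

In the Leiden algorithm, the partition passed to this step would be the refined one, so a community that was split during refinement contributes several nodes to the aggregate network.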

Guarantees and Performance

Theoretical Guarantees

The Leiden algorithm offers various theoretical guarantees, which the authors rigorously prove:

Per Iteration: The algorithm guarantees both γ-separation and γ-connectivity of the communities. γ-separation means that no two communities can be merged in a way that improves the quality function, while γ-connectivity ensures that each community is internally connected.
Stable Iterations: Each stable iteration (where no further improvements can be made) guarantees that the communities are subpartition γ-dense and node optimal.
Asymptotic Guarantees: When the algorithm is applied iteratively, it converges to a partition in which all communities are subset optimal and uniformly γ-dense. This means that no subset of a community can be moved to a different community to improve the quality function further.
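To make γ-separation concrete, the worked derivation below assumes the Constant Potts Model (CPM) as the quality function, which the paper considers alongside modularity. The notation is chosen here for illustration: e_C is the number of edges inside community C, n_C its number of nodes, and E(C, D) the number of edges between communities C and D.

```latex
\begin{align}
H &= \sum_{C} \left[ e_C - \gamma \binom{n_C}{2} \right], \\
\Delta H_{\mathrm{merge}}(C, D)
  &= E(C, D) - \gamma \left[ \binom{n_C + n_D}{2} - \binom{n_C}{2} - \binom{n_D}{2} \right] \\
  &= E(C, D) - \gamma \, n_C \, n_D .
\end{align}
```

Under this assumption, merging two communities cannot improve the partition exactly when E(C, D) ≤ γ n_C n_D, that is, when the density of links between them does not exceed γ.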

Empirical Validation

The paper provides an empirical analysis on benchmark networks and several large real-world networks, such as DBLP, Amazon, and the Web of Science. The Leiden algorithm is consistently faster than the Louvain algorithm, and the difference is largest on the larger networks (up to 20 times faster). It also finds partitions with higher modularity scores.

Implications and Future Directions

The Leiden algorithm matters for both theoretical work and practical applications. It sets a stronger baseline for community detection methods, and potential future developments include optimized implementations for other types of networks, such as weighted, directed, or temporal networks. Its speed and connectivity guarantees could also benefit analyses in domains including biology, neuroscience, and bibliometrics.

Conclusion

The paper effectively addresses a critical shortcoming in community detection with the Louvain algorithm by introducing the Leiden algorithm, which provides substantial improvements in speed, quality, and theoretical guarantees for well-connected communities. The advancements presented extend the state-of-the-art in network science, rendering the Leiden algorithm a highly recommended tool for network analysts.

This detailed exploration of the algorithmic enhancements provides a solid foundation for future research and applications, signaling considerable progress in the field of community detection.
