Consensus clustering in complex networks (1203.6093v1)

Published 27 Mar 2012 in physics.soc-ph, cs.IR, and cs.SI

Abstract: The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on the specific random seeds, initial conditions and tie-break rules adopted for their execution. Consensus clustering is used in data analysis to generate stable results out of a set of partitions delivered by stochastic methods. Here we show that consensus clustering can be combined with any existing method in a self-consistent way, enhancing considerably both the stability and the accuracy of the resulting partitions. This framework is also particularly suitable to monitor the evolution of community structure in temporal networks. An application of consensus clustering to a large citation network of physics papers demonstrates its capability to keep track of the birth, death and diversification of topics.


Summary

The paper introduces a consensus clustering framework that refines community detection by combining multiple stochastic algorithm runs into a stable consensus partition.
It applies the base algorithm iteratively to a consensus matrix, which yields higher accuracy as measured by Normalized Mutual Information (NMI) on benchmark graphs.
The approach improves stability on real-world networks such as the APS citation network and supports tracking community structure in temporal networks.

Consensus Clustering in Complex Networks

Overview

The paper "Consensus clustering in complex networks" by Andrea Lancichinetti and Santo Fortunato applies consensus clustering to community detection in complex networks. Most community detection methods are not deterministic: their results depend on random seeds, initial conditions and tie-break rules. The paper addresses this by combining the partitions from many runs into a single stable and more accurate partition.

Key Contributions

Lancichinetti and Fortunato introduce a framework that integrates consensus clustering with existing community detection methods to enhance the stability and accuracy of the resulting partitions. The technique is particularly beneficial in analyzing temporal networks and tracking the evolution of community structures within these networks.

Methodology

The authors run a given stochastic community detection algorithm several times on the same network to produce a set of partitions. These partitions are consolidated into a consensus matrix whose entry for a pair of vertices is the fraction of runs in which the two vertices were assigned to the same cluster. Entries below a chosen threshold are set to zero, and the same algorithm is then applied repeatedly to the consensus matrix, treated as a weighted network. The procedure is iterated until all runs return the same partition, which is taken as the consensus partition.
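The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the base method is a simple weighted label propagation chosen only because it is short and stochastic, and the function names (`label_propagation`, `as_blocks`, `consensus_matrix`, `consensus_partition`) and the defaults for `n_p`, `tau` and `max_rounds` are hypothetical. Vertices that lose all their links after thresholding are simply left as singletons here.

```python
import random
from collections import defaultdict
from itertools import combinations


def label_propagation(adj, rng, max_sweeps=100):
    """Stand-in stochastic base method (an assumption, not the paper's choice).

    adj maps node -> {neighbor: weight}. Nodes are visited in random order
    and adopt the label carrying the largest total weight among neighbors.
    """
    labels = {v: v for v in adj}
    nodes = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)
        changed = False
        for v in nodes:
            if not adj[v]:
                continue  # isolated node keeps its own label
            score = defaultdict(float)
            for u, w in adj[v].items():
                score[labels[u]] += w
            best = max(score.values())
            candidates = [lab for lab, s in score.items() if s == best]
            if labels[v] not in candidates:
                labels[v] = rng.choice(candidates)  # random tie-break
                changed = True
        if not changed:
            break
    return labels


def as_blocks(labels):
    """Turn a node -> label map into a hashable set of vertex sets."""
    groups = defaultdict(set)
    for v, lab in labels.items():
        groups[lab].add(v)
    return frozenset(frozenset(g) for g in groups.values())


def consensus_matrix(partitions, nodes):
    """Entry [i][j] is the fraction of partitions that put i and j together."""
    counts = {v: defaultdict(int) for v in nodes}
    for blocks in partitions:
        for block in blocks:
            for i, j in combinations(block, 2):
                counts[i][j] += 1
                counts[j][i] += 1
    n_p = len(partitions)
    return {v: {u: c / n_p for u, c in row.items()} for v, row in counts.items()}


def consensus_partition(adj, n_p=20, tau=0.5, seed=0, max_rounds=50):
    """Iterate: run the base method n_p times, build and threshold the
    consensus matrix, and repeat on it until all n_p runs agree."""
    rng = random.Random(seed)
    nodes = list(adj)
    current = adj
    for _ in range(max_rounds):
        partitions = [as_blocks(label_propagation(current, rng)) for _ in range(n_p)]
        if len(set(partitions)) == 1:
            return partitions[0]  # every run agrees: consensus reached
        d = consensus_matrix(partitions, nodes)
        # Drop weak co-occurrences; the result is the next weighted graph.
        current = {v: {u: w for u, w in row.items() if w >= tau} for v, row in d.items()}
    return partitions[0]  # no agreement within max_rounds; return one run
```

Calling `consensus_partition(adj)` on an adjacency dictionary returns a set of vertex sets. Any other stochastic method could replace `label_propagation`, provided it accepts a weighted graph, since the consensus matrix is weighted even when the original network is not.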

Experimental Results

The effectiveness of the proposed approach is rigorously tested using artificial benchmark graphs with a known community structure. The authors utilize the LFR benchmark graphs, which display power-law distributions of vertex degree and community size, to demonstrate the superior performance of consensus clustering. The evaluation metrics include Normalized Mutual Information (NMI), which quantifies the similarity between the detected and true partitions.

Accuracy: The consensus clustering consistently yields higher NMI values compared to the individual runs of the clustering algorithms. This improvement is particularly notable for the method by Clauset et al., which shows significant enhancement even where the direct application of the method performs poorly.
Stability: Stability is assessed by comparing the NMI between consecutive consensus partitions and individual runs. For both the neural network of C. elegans and the American Physical Society (APS) citation network, consensus clustering demonstrated higher stability and consistency over increasing numbers of runs.
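Both the accuracy and the stability comparisons rest on NMI between two partitions of the same vertex set. A small sketch of the computation follows, assuming the common normalization 2 I(A;B) / (H(A) + H(B)); the paper may use a different variant, and the function name `nmi` is illustrative.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same items.

    Uses 2 * I(A;B) / (H(A) + H(B)), which is 1 for identical partitions
    (up to relabeling) and 0 for statistically independent ones.
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("labelings must cover the same items")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    mi = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        return 1.0  # both labelings are a single cluster
    return 2 * mi / (h_a + h_b)
```

For an accuracy check, `labels_a` would hold the planted communities of a benchmark graph and `labels_b` the detected ones. For a stability check, the two arguments would be partitions obtained from different runs on the same network.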

Practical and Theoretical Implications

Consensus clustering addresses the stochastic variability of traditional methods, providing more reliable community detection in large and complex networks. It reduces the sensitivity of the result to random seeds, initial conditions and tie-break rules, at the price of running the base method many times. By enhancing stability and accuracy, consensus clustering facilitates the robust analysis of temporal networks and the tracking of dynamic changes in community structures over time.

Future Directions

The research opens avenues for further improvements in community detection:

Scalability: Enhancing the scalability of the consensus clustering framework for application to even larger datasets.
Temporal Networks: Developing more sophisticated techniques for tracking community evolution in networks with non-linear or irregular temporal patterns.
Algorithmic Integration: Exploring the integration of consensus clustering with other advanced community detection algorithms to further improve accuracy and stability.

Conclusion

Lancichinetti and Fortunato's work on consensus clustering presents a significant advancement in the field of community detection in complex networks. By addressing the variability and instability of traditional methods, this framework offers a more robust and consistent approach to uncovering the underlying structure of complex systems. The results demonstrate the effectiveness of consensus clustering in both synthetic and real-world networks, establishing a foundation for continued innovation in network analysis and community detection.
