
Community detection in node-attributed social networks: a survey (1912.09816v2)

Published 20 Dec 2019 in cs.SI, cs.LG, and cs.PF

Abstract: Community detection is a fundamental problem in social network analysis consisting in unsupervised dividing social actors (nodes in a social graph) with certain social connections (edges in a social graph) into densely knitted and highly related groups with each group well separated from the others. Classical approaches for community detection usually deal only with network structure and ignore features of its nodes (called node attributes), although many real-world social networks provide additional actors' information such as interests. It is believed that the attributes may clarify and enrich the knowledge about the actors and give sense to the communities. This belief has motivated the progress in developing community detection methods that use both the structure and the attributes of network (i.e. deal with a node-attributed graph) to yield more informative and qualitative results. During the last decade many such methods based on different ideas have appeared. Although there exist partial overviews of them, a recent survey is a necessity as the growing number of the methods may cause repetitions in methodology and uncertainty in practice. In this paper we aim at describing and clarifying the overall situation in the field of community detection in node-attributed social networks. Namely, we perform an exhaustive search of known methods and propose a classification of them based on when and how structure and attributes are fused. We not only give a description of each class but also provide general technical ideas behind each method in the class. Furthermore, we pay attention to available information which methods outperform others and which datasets and quality measures are used for their evaluation. Basing on the information collected, we make conclusions on the current state of the field and disclose several problems that seem important to be resolved in future.

Citations (214)

Summary

  • The paper provides a comprehensive survey of approaches that merge structural and attribute information to improve community detection.
  • It classifies techniques into early, simultaneous, and late fusion strategies, detailing various integration methodologies.
  • The survey emphasizes benchmarking challenges and points to future directions in embedding methods and standard evaluation protocols.

Overview of Community Detection in Node-Attributed Social Networks: A Survey

The paper by Petr Chunaev surveys methods for community detection in node-attributed social networks. In such a network, each social actor is a node and the connections between actors are edges. Traditional community detection approaches have focused on network structure and often disregard node attributes, such as age, gender, or interests, which can provide additional insight into community formation. More recent methods integrate both structure and attributes to obtain more informative results.

Classification of Methods

The survey classifies community detection methods into three distinct categories based on when the network structure and node attributes are integrated during the detection process:

  1. Early Fusion Methods: These methods combine structural and attribute information before the community detection process.
    • Weight-based Methods: Convert attribute similarity into edge weights on the graph and then apply graph clustering algorithms to the resulting weighted graph.
    • Distance-based Methods: Create a distance matrix using a function that fuses structural and attributive distance measures, which is subsequently used for clustering.
    • Node-Augmented Graph-Based Methods: Extend the original graph with additional attribute nodes to assist in detection.
    • Embedding-Based Methods: Utilize embedding techniques to obtain node representations that incorporate both structural and attribute information.
    • Pattern Mining-Based Methods: Identify motifs or patterns within the graph to guide the detection process.
  2. Simultaneous Fusion Methods: These methods achieve fusion concurrently with the community detection process.
    • Objective Function Modification: Modify the objective functions of classical clustering algorithms like Louvain to incorporate attribute considerations.
    • Metaheuristic-Based Methods: Employ metaheuristics such as genetic algorithms to optimize community quality with respect to both structure and attributes.
    • NNMF-Based Methods: Apply non-negative matrix factorization techniques to simultaneously handle structure and attribute data.
    • Pattern Mining and Probabilistic Models: Use probabilistic models or pattern-driven approaches in which structure and attributes jointly determine the inferred communities.
    • Dynamical and Agent-Based Systems: Adopt models viewing community development as a dynamic interaction or negotiation process.
  3. Late Fusion Methods: Here, separate partitions derived from structural and attribute information are combined post-detection.
    • Consensus-Based Approaches: Use ensemble methods to achieve a unified community partition from separate structure- and attribute-driven partitions.
    • Switch-Based Methods: Make decisions on adopting structural or attribute partitions based on certain criteria or thresholds.
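
To make the weight-based early fusion idea from the list above concrete, here is a minimal sketch rather than any specific method from the survey. It assumes set-valued node attributes, Jaccard similarity as the attribute measure, a linear mixing parameter `alpha`, and a deliberately simple stand-in clustering step (drop weak edges, then take connected components). The helper names `jaccard`, `fuse_weights`, and `threshold_components` are hypothetical, as is the toy graph.

```python
def jaccard(a, b):
    """Jaccard similarity of two attribute sets (assumed attribute measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fuse_weights(edges, attrs, alpha=0.5):
    """Early fusion: give each existing edge a weight that mixes structure
    (a constant 1 for 'edge exists') with attribute similarity."""
    return {
        (u, v): alpha * 1.0 + (1 - alpha) * jaccard(attrs[u], attrs[v])
        for u, v in edges
    }


def threshold_components(nodes, weights, threshold):
    """Stand-in clustering step (an assumption, not a method from the survey):
    keep edges with weight >= threshold and return connected components."""
    adj = {n: set() for n in nodes}
    for (u, v), w in weights.items():
        if w >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    seen, components = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, comp = [n], set()
        while stack:
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            stack.extend(adj[x] - comp)
        seen |= comp
        components.append(comp)
    return components


# Toy graph: two triangles joined by the bridge c-d.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
attrs = {
    "a": {"chess", "go"}, "b": {"chess", "go"}, "c": {"chess", "go"},
    "d": {"football"}, "e": {"football"}, "f": {"football"},
}

weights = fuse_weights(edges, attrs, alpha=0.5)
communities = threshold_components(nodes, weights, threshold=0.75)
print(communities)
```

In this toy case the bridge edge c-d connects nodes with no shared attributes, so its fused weight stays at 0.5 and it is dropped, while edges inside each triangle reach weight 1.0. In practice the thresholding step would be replaced by a weighted graph clustering algorithm.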

Evaluation and Comparison

The paper outlines the evaluation measures used in community detection, ranging from Modularity and Entropy, which need no ground truth, to Normalized Mutual Information (NMI), which is used when ground truth is available. A significant observation, however, is that there is no unified approach to comparison and benchmarking, which makes it hard to determine a "best" method in the field. The paper also stresses that computational complexity, which matters for practical applications, is overlooked in many studies.
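
As an illustration of two of the measures named above, the following sketch computes Modularity for an unweighted, undirected graph and NMI between two labelings. It is a plain-Python approximation under stated assumptions: the function names `modularity`, `entropy`, and `nmi` are hypothetical, the NMI is normalized by the arithmetic mean of the two entropies (other normalizations exist), and the toy graph is invented for the example.

```python
import math
from collections import Counter


def modularity(edges, membership):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2m))^2 ] for an
    unweighted, undirected graph, where L_c is the number of edges inside
    community c and d_c is the total degree of its nodes."""
    m = len(edges)
    internal, degree = Counter(), Counter()
    for u, v in edges:
        degree[membership[u]] += 1
        degree[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_a, labels_b):
    """Mutual information divided by the mean of the two entropies
    (arithmetic normalization, chosen here as an assumption)."""
    n = len(labels_a)
    ca, cb = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = sum((c / n) * math.log(c * n / (ca[x] * cb[y]))
             for (x, y), c in joint.items())
    mean_h = (entropy(labels_a) + entropy(labels_b)) / 2
    # Convention: two trivial one-cluster labelings count as identical.
    return mi / mean_h if mean_h > 0 else 1.0


# Toy graph: two triangles joined by one bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
membership = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

q = modularity(edges, membership)   # 5/14, about 0.357
same = nmi([0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
print(q, same, independent)
```

Modularity scores a partition using the graph alone, whereas NMI compares a detected partition against a reference labeling, which is why it applies only when ground truth exists.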

Implications and Future Directions

The integration of structural and attribute data in community detection opens opportunities for more nuanced and informative analyses of social networks. Future directions could involve developing theoretical frameworks to better understand when fusion is beneficial and establishing standardized practices for method evaluation and comparison. Additionally, advancements in embedding-based methods and machine learning present promising avenues for enhancing detection capabilities in complex attributed networks.

In summary, the paper delivers a detailed survey of existing methods, highlights the current challenges in community detection within node-attributed social networks, and calls for a more structured and unified methodology for future research and application developments.