A Scalable Generative Graph Model with Community Structure

Published 27 Feb 2013 in cs.SI and physics.soc-ph | (1302.6636v3)

Abstract: Network data is ubiquitous and growing, yet we lack realistic generative network models that can be calibrated to match real-world data. The recently proposed Block Two-Level Erdős-Rényi (BTER) model can be tuned to capture two fundamental properties: degree distribution and clustering coefficients. The latter is particularly important for reproducing graphs with community structure, such as social networks. In this paper, we compare BTER to other scalable models and show that it gives a better fit to real data. We provide a scalable implementation that requires only O(d_max) storage where d_max is the maximum number of neighbors for a single node. The generator is trivially parallelizable, and we show results for a Hadoop MapReduce implementation for modeling a real-world web graph with over 4.6 billion edges. We propose that the BTER model can be used as a graph generator for benchmarking purposes and provide idealized degree distributions and clustering coefficient profiles that can be tuned for user specifications.

Citations (165)

Summary

  • The paper develops the recently proposed Block Two-Level Erdős-Rényi (BTER) model into a scalable generative graph model that can be calibrated to match the degree distribution and clustering coefficients of large real-world networks with community structure.
  • The BTER model uses a two-level approach: dense affinity blocks produce local clustering, and a second phase adds global connections between blocks. The generator is trivially parallelizable, which makes it practical for graphs with billions of edges.
  • Comparative analysis shows the BTER model provides a superior fit to real-world data compared to models like SKG and Chung-Lu, particularly in reproducing clustering coefficients, and serves as a practical benchmark for network simulation.
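
The two properties that BTER is tuned to match can be measured directly on any undirected edge list. The following is a minimal sketch in plain Python, not code from the paper; the helper names (`build_adjacency`, `degree_histogram`, `local_clustering`, `clustering_by_degree`) are hypothetical and chosen only for illustration. It computes the degree histogram and the mean local clustering coefficient per degree, which is one common way to express a clustering coefficient profile.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops and duplicates."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def degree_histogram(adj):
    """Return {degree: number of nodes with that degree}."""
    return dict(Counter(len(nbrs) for nbrs in adj.values()))


def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    nbrs = adj[node]
    d = len(nbrs)
    if d < 2:
        return 0.0  # convention assumed here: no neighbor pairs means 0
    closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / (d * (d - 1) / 2)


def clustering_by_degree(adj):
    """Return {degree: mean local clustering coefficient of nodes with that degree}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, nbrs in adj.items():
        d = len(nbrs)
        sums[d] += local_clustering(adj, node)
        counts[d] += 1
    return {d: sums[d] / counts[d] for d in counts}


if __name__ == "__main__":
    # A triangle (0, 1, 2) with one pendant node (3) attached to node 2.
    adj = build_adjacency([(0, 1), (1, 2), (0, 2), (2, 3)])
    print(degree_histogram(adj))
    print(clustering_by_degree(adj))
```

Comparing these two profiles between a real graph and a generated one is a simple way to check how closely a generator fits the data. The pairwise neighbor check is quadratic in the degree, so this sketch is only suitable for small graphs.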

Overview of "A Scalable Generative Graph Model with Community Structure"

The paper "A Scalable Generative Graph Model with Community Structure" by Kolda et al. studies the recently proposed Block Two-Level Erdős-Rényi (BTER) model as an approach to modeling large-scale network data. The model aims to replicate two key properties of real-world networks, the degree distribution and the clustering coefficients, which are essential for networks with community structure such as social networks. The paper compares BTER with other scalable models and provides a scalable implementation suited to very large datasets.

Key Contributions and Methodology

  1. BTER Model Description:
    • The BTER model divides nodes into "affinity blocks," which capture local clustering through the generation of dense subgraphs. Each block comprises nodes that have a high probability of being interconnected, reflecting communities within the network.
    • The model accounts for global connections across blocks using principles similar to the Chung-Lu model, creating edges based on nodes' residual degrees after local connections.
  2. Scalability and Parallelization:
    • The BTER generator is designed to be highly scalable, requiring only log-scale operations per edge and, per the abstract, only O(d_max) storage, where d_max is the maximum number of neighbors of a single node. It is trivially parallelizable, which allows practical implementation on distributed computing frameworks such as Hadoop MapReduce.
    • The generator is demonstrated at scale with a Hadoop MapReduce implementation that models a real-world web graph with over 4.6 billion edges.
  3. Implementation Details and Comparison:
    • The paper provides a detailed and transparent implementation guide, addressing issues such as parameter selection, edge duplication, and degree-1 node handling.
    • BTER is compared against alternative models, namely the Stochastic Kronecker Graph (SKG) and Chung-Lu models. The comparison shows that BTER fits real-world data better, particularly in clustering coefficients, an area where the competing models fall short.
  4. Practical and Theoretical Implications:
    • From a practical standpoint, BTER can serve as a benchmark generator, providing idealized degree distributions and clustering coefficient profiles adjustable via user specifications.
    • The model’s flexibility allows it to serve various domains requiring complex network generation, including simulation and testing of graph-processing algorithms.
  5. Future Directions:
    • The paper hints at potential enhancements, such as extending the current model to handle directed and weighted graphs, which would further deepen its applicability in diverse network data scenarios.
    • Addressing more complex community structures and exploring temporal dynamics are identified as intriguing avenues for future exploration.
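
The two-level construction described in item 1 can be sketched as follows. This is a deliberately simplified illustration, not the paper's implementation: the function name `bter_sketch` is hypothetical, a single constant block density `rho` stands in for whatever per-block connectivity the paper derives from the target clustering coefficients, blocks are formed by simply sorting nodes by degree, and the careful handling of degree-1 nodes and duplicate edges mentioned in item 3 is omitted. Phase 1 builds dense Erdős-Rényi blocks, and phase 2 adds Chung-Lu style edges in proportion to each node's residual degree.

```python
import random


def bter_sketch(degrees, rho=0.9, seed=0):
    """Simplified two-phase BTER-like generator (illustrative assumptions only).

    degrees: desired degree of each node, indexed by node id.
    rho:     assumed constant edge probability inside every affinity block.
    Returns a set of undirected edges (u, v) with u < v.
    """
    rng = random.Random(seed)
    n = len(degrees)
    edges = set()
    # Residual ("excess") degree left for phase 2; starts at the full degree.
    excess = [float(d) for d in degrees]

    # Phase 1: group nodes of similar degree into affinity blocks.
    # A block led by a node of degree d holds up to d + 1 nodes.
    order = sorted((i for i in range(n) if degrees[i] >= 2),
                   key=lambda i: degrees[i])
    pos = 0
    while pos < len(order):
        size = degrees[order[pos]] + 1
        block = order[pos:pos + size]
        pos += size
        # Erdős-Rényi graph inside the block with edge probability rho.
        for a in range(len(block)):
            for b in range(a + 1, len(block)):
                if rng.random() < rho:
                    u, v = block[a], block[b]
                    edges.add((min(u, v), max(u, v)))
        # Subtract the expected within-block degree from each member.
        expected_local = rho * (len(block) - 1)
        for u in block:
            excess[u] = max(0.0, degrees[u] - expected_local)

    # Phase 2: Chung-Lu style edges, endpoints drawn proportionally to excess degree.
    total = sum(excess)
    if total > 0:
        nodes = list(range(n))
        for _ in range(int(round(total / 2))):
            u, v = rng.choices(nodes, weights=excess, k=2)
            if u != v:  # self-loops are dropped; duplicates collapse in the set
                edges.add((min(u, v), max(u, v)))
    return edges


if __name__ == "__main__":
    # Eight nodes of degree 3 and four nodes of degree 1.
    graph = bter_sketch([3] * 8 + [1] * 4, rho=0.9, seed=42)
    print(len(graph), "edges")
```

Each edge draw in phase 2 depends only on the excess-degree weights, and each block in phase 1 is independent of the others, which illustrates why a generator of this shape parallelizes easily. Because self-loops and duplicate edges are simply discarded here, realized degrees in this sketch tend to fall slightly below the requested ones.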

In conclusion, the BTER model stands as a comprehensive and adaptable tool for generating realistic large-scale graphs, fulfilling a critical need in network science research for models combining scalability with fidelity to real-world structures. This work significantly advances the ability to simulate vast networks and study complex phenomena within them efficiently.
