Truss Decomposition in Massive Networks (1205.6693v1)

Published 30 May 2012 in cs.DB

Abstract: The k-truss is a type of cohesive subgraphs proposed recently for the study of networks. While the problem of computing most cohesive subgraphs is NP-hard, there exists a polynomial time algorithm for computing k-truss. Compared with k-core which is also efficient to compute, k-truss represents the "core" of a k-core that keeps the key information of, while filtering out less important information from, the k-core. However, existing algorithms for computing k-truss are inefficient for handling today's massive networks. We first improve the existing in-memory algorithm for computing k-truss in networks of moderate size. Then, we propose two I/O-efficient algorithms to handle massive networks that cannot fit in main memory. Our experiments on real datasets verify the efficiency of our algorithms and the value of k-truss.

Summary

The paper introduces novel algorithms for k-truss decomposition, significantly improving efficiency in analyzing massive networks.
It presents an improved in-memory approach with O(m^1.5) complexity alongside I/O-efficient bottom-up and top-down strategies for large datasets.
Results on real-world networks like LiveJournal validate that k-truss methods capture more cohesive community structures than k-core models.

Analyzing Truss Decomposition in Massive Networks

The scale and complexity of today's networks have prompted academic interest in identifying cohesive substructures within them. The paper addresses the truss decomposition problem, which asks for the k-trusses of a network for all values of k. A k-truss is the largest subgraph in which every edge is part of at least (k-2) triangles within that subgraph. This concept refines the k-core by using triangle support, rather than vertex degree, as its building block, which yields a more cohesive subgraph. Unlike a clique, which requires every pair of vertices to be directly connected, a k-truss is a more relaxed structure that suits large, sparse graphs.
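To make the definition concrete, the following minimal Python sketch counts the triangles each edge takes part in (its support) and checks whether a given edge set meets the (k-2)-triangle condition. The function names `edge_support` and `satisfies_truss_condition` are illustrative and do not come from the paper. The sketch assumes a simple undirected graph with no self-loops or duplicate edges, and it checks only the triangle condition, not maximality.

```python
from itertools import combinations


def edge_support(edges):
    """Return {edge: number of triangles containing it}; edges are frozensets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # An edge (u, v) lies in exactly one triangle per common neighbour of u and v.
    return {frozenset((u, v)): len(adj[u] & adj[v]) for u, v in edges}


def satisfies_truss_condition(edges, k):
    """True if every edge lies in at least k - 2 triangles of this edge set."""
    return all(s >= k - 2 for s in edge_support(edges).values())


# A 4-clique: every edge is in 2 triangles, so the condition holds for k = 4.
k4 = list(combinations("abcd", 2))
# Adding a pendant edge (d, e) with no triangles breaks the condition for k >= 3.
k4_tail = k4 + [("d", "e")]

print(satisfies_truss_condition(k4, 4))       # True
print(satisfies_truss_condition(k4_tail, 3))  # False
```

The pendant edge illustrates why the k-truss is stricter than a degree-based notion: vertex d has high degree, yet the edge (d, e) is supported by no triangle.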

The k-truss can be computed in polynomial time, which distinguishes it from the many cohesive-subgraph problems that are NP-hard. The paper criticizes existing algorithms as inefficient for large networks that cannot fully reside in memory. Cohen's MapReduce approach, for example, suffers from prohibitive I/O overhead because it iterates repeatedly over the graph data.

To address this, the authors make two contributions. First, an improved in-memory algorithm for moderate-sized networks builds on the existing approach by processing edges in order of their support, which reduces the time complexity to O(m^1.5), where m is the number of edges. Second, for networks that do not fit in main memory, they propose two I/O-efficient algorithms, one bottom-up and one top-down.
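The in-memory idea can be illustrated with a simplified peeling sketch in Python: repeatedly remove the edges whose support is too low for the next truss level, and update the support of the edges that shared a triangle with them. This is not the paper's implementation. In particular, it rescans the remaining edges at each level instead of keeping them in a support-sorted array, so it should not be expected to meet the O(m^1.5) bound. The name `truss_decomposition` and the example graph are assumptions made for illustration, and the input is assumed to be a simple undirected graph.

```python
def truss_decomposition(edges):
    """Map each edge (a frozenset of two endpoints) to its truss number."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Support = number of triangles containing the edge.
    sup = {frozenset((u, v)): len(adj[u] & adj[v]) for u, v in edges}
    truss = {}
    k = 2
    while sup:
        # Edges with support <= k - 2 cannot belong to a (k + 1)-truss.
        stack = [e for e, s in sup.items() if s <= k - 2]
        while stack:
            e = stack.pop()
            u, v = tuple(e)
            # Removing (u, v) destroys one triangle per common neighbour w.
            for w in adj[u] & adj[v]:
                for f in (frozenset((u, w)), frozenset((v, w))):
                    sup[f] -= 1
                    if sup[f] == k - 2:  # f just fell to the removal threshold
                        stack.append(f)
            adj[u].discard(v)
            adj[v].discard(u)
            del sup[e]
            truss[e] = k
        k += 1
    return truss


# Example: a 4-clique {1,2,3,4}, a triangle {4,5,6}, and a pendant edge (6,7).
graph = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4),
         (4, 5), (4, 6), (5, 6), (6, 7)]
result = truss_decomposition(graph)
print(result[frozenset((1, 2))])  # 4
print(result[frozenset((5, 6))])  # 3
print(result[frozenset((6, 7))])  # 2
```

The k-truss for any k can then be read off as the set of edges whose truss number is at least k.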

The bottom-up approach starts from the smallest values of k and extracts k-trusses progressively, using a lower-bound estimate of each edge's truss number. This reduces the search space by pruning edges that need not be examined at the current value of k. The top-down approach starts from the largest truss numbers and derives the most cohesive subgraphs first, which is particularly advantageous when only the top trusses are of interest.
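The top-down intuition can be sketched in memory as follows. An edge with support s can have truss number at most s + 2, so the largest support in the graph gives an upper bound on the highest truss level, and the search can start there and move downward until a non-empty truss is found. This Python sketch is only an illustration of that idea under simplifying assumptions: it ignores I/O entirely, the helper names `k_truss` and `top_truss` are hypothetical, and the paper's actual top-down algorithm is more involved.

```python
def k_truss(edges, k):
    """Return the edge set of the k-truss by repeatedly deleting weak edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Delete the edge if it is in fewer than k - 2 triangles.
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {frozenset((u, v)) for u in adj for v in adj[u]}


def top_truss(edges):
    """Find the largest k with a non-empty k-truss, searching from the top."""
    if not edges:
        return None, set()
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Upper bound: no edge can have truss number above (max support) + 2.
    k = max(len(adj[u] & adj[v]) for u, v in edges) + 2
    while k > 2:
        truss = k_truss(edges, k)
        if truss:
            return k, truss
        k -= 1
    # Every simple graph with at least one edge is its own 2-truss.
    return 2, {frozenset(e) for e in edges}


# Example: a 4-clique {1,2,3,4}, a triangle {4,5,6}, and a pendant edge (6,7).
graph = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4),
         (4, 5), (4, 6), (5, 6), (6, 7)]
k_max, top = top_truss(graph)
print(k_max, len(top))  # 4 6
```

When only the top truss is needed, this avoids computing truss numbers for the many low-level edges, which is the motivation the summary gives for the top-down strategy.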

The paper's main strength is its experimental evaluation on real-world data, which shows the proposed methods substantially outperforming existing algorithms on several large datasets. Datasets such as the LiveJournal network, with millions of nodes, illustrate the scalability of the approach.

The research supports the k-truss as a compelling alternative to the k-core: it discards edges that participate in few triangles, so the remaining subgraph reflects more tightly knit sub-communities. More efficient truss decomposition enables finer-grained analysis of networks, especially in fields that rely on community detection, such as social network analysis and bioinformatics. Future research may extend truss decomposition to weighted networks and to dynamic settings where the network topology changes frequently.
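A small constructed example shows how the two notions can differ. In the Python sketch below, the graph is the disjoint union of a 4-clique and a complete bipartite graph K(3,3). Every vertex has degree 3, so the 3-core keeps the whole graph, but the bipartite part contains no triangles and is dropped from the 4-truss. The graph and the helper names `k_core_vertices` and `k_truss_vertices` are assumptions chosen for illustration and are not taken from the paper's experiments.

```python
from itertools import combinations, product


def build_adj(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_core_vertices(edges, k):
    """Vertices of the k-core: repeatedly drop vertices of degree < k."""
    adj = build_adj(edges)
    queue = [u for u in adj if len(adj[u]) < k]
    while queue:
        u = queue.pop()
        for v in adj.pop(u, set()):  # empty if u was already removed
            adj[v].discard(u)
            if len(adj[v]) < k:
                queue.append(v)
    return set(adj)


def k_truss_vertices(edges, k):
    """Vertices of the k-truss: repeatedly drop edges in < k - 2 triangles."""
    adj = build_adj(edges)
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {u for u in adj if adj[u]}


clique = list(combinations([1, 2, 3, 4], 2))          # 4-clique
bipartite = list(product([5, 6, 7], [8, 9, 10]))      # K(3,3), triangle-free
graph = clique + bipartite

core3 = k_core_vertices(graph, 3)
truss4 = k_truss_vertices(graph, 4)
print(sorted(core3))   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(sorted(truss4))  # [1, 2, 3, 4]
```

The comparison uses the 3-core and the 4-truss because every vertex of a k-truss has at least k-1 neighbours in it, so a k-truss is always contained in the (k-1)-core.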

In conclusion, this work extends network cohesion analysis with truss decomposition and offers computationally viable algorithms for the very large graphs typical of modern big-data environments. This makes cohesive-subgraph analysis more practical across a range of complex systems.
