Community detection thresholds and the weak Ramanujan property (1311.3085v1)

Published 13 Nov 2013 in cs.SI

Abstract: Decelle et al.\cite{Decelle11} conjectured the existence of a sharp threshold for community detection in sparse random graphs drawn from the stochastic block model. Mossel et al.\cite{Mossel12} established the negative part of the conjecture, proving impossibility of meaningful detection below the threshold. However the positive part of the conjecture remained elusive so far. Here we solve the positive part of the conjecture. We introduce a modified adjacency matrix $B$ that counts self-avoiding paths of a given length $\ell$ between pairs of nodes and prove that for logarithmic $\ell$, the leading eigenvectors of this modified matrix provide non-trivial detection, thereby settling the conjecture. A key step in the proof consists in establishing a {\em weak Ramanujan property} of matrix $B$. Namely, the spectrum of $B$ consists in two leading eigenvalues $\rho(B)$, $\lambda_2$ and $n-2$ eigenvalues of a lower order $O(n^{\epsilon}\sqrt{\rho(B)})$ for all $\epsilon>0$, $\rho(B)$ denoting $B$'s spectral radius. $d$-regular graphs are Ramanujan when their second eigenvalue verifies $|\lambda|\le 2 \sqrt{d-1}$. Random $d$-regular graphs have a second largest eigenvalue $\lambda$ of $2\sqrt{d-1}+o(1)$ (see Friedman\cite{friedman08}), thus being {\em almost} Ramanujan. Erd\H{o}s-R\'enyi graphs with average degree $d$ at least logarithmic ($d=\Omega(\log n)$) have a second eigenvalue of $O(\sqrt{d})$ (see Feige and Ofek\cite{Feige05}), a slightly weaker version of the Ramanujan property. However this spectrum separation property fails for sparse ($d=O(1)$) Erd\H{o}s-R\'enyi graphs. Our result thus shows that by constructing matrix $B$ through neighborhood expansion, we regularize the original adjacency matrix to eventually recover a weak form of the Ramanujan property.

Citations (422)

Summary

The paper demonstrates that community detection becomes achievable when SBM parameters exceed a critical threshold, confirming the positive part of Decelle et al.'s conjecture.
It introduces a modified adjacency matrix that counts self-avoiding paths and shows that this matrix satisfies a weak Ramanujan property, a spectral separation similar to the one enjoyed by Ramanujan graphs.
The research refines spectral clustering techniques and reveals phase transitions in sparse graphs, paving the way for broader applications in graph theory.

Community Detection Thresholds and the Weak Ramanujan Property

The paper by Laurent Massoulié tackles an open problem in community detection for random graph models: the conjecture of Decelle et al. regarding the stochastic block model (SBM). In the SBM, nodes are divided into communities, and the challenge is to uncover these hidden communities from the observed interactions, represented as edges of a graph. The conjecture posited a sharp threshold for the feasibility of community reconstruction. Mossel, Neeman, and Sly had established the negative part, showing that meaningful detection is impossible below the threshold, while the positive part had remained open.

Key Contributions and Methodology

The main contribution of this work is the resolution of the positive part of the conjecture: community detection is possible when the parameters exceed the threshold. The method introduces a modified adjacency matrix $B$ whose entries count the self-avoiding paths of a given length $\ell$ between pairs of nodes, with $\ell$ logarithmic in the number of nodes $n$. Built this way, $B$ encodes neighborhood expansion, and it exhibits a spectral separation analogous to the Ramanujan property, which is what makes spectral clustering on its leading eigenvectors effective.
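As a rough illustration of what such a matrix looks like, the sketch below counts self-avoiding paths by brute-force depth-first search on a tiny graph. It is not the paper's construction or analysis: the function name `self_avoiding_path_matrix`, the adjacency-dictionary input, and the 4-cycle example are all assumptions made here for illustration, and the exhaustive search is only practical for small graphs.

```python
from typing import Dict, List, Set


def self_avoiding_path_matrix(adj: Dict[int, Set[int]], length: int) -> List[List[int]]:
    """Return B with B[i][j] = number of self-avoiding paths that use exactly
    `length` edges and run from node i to node j (hypothetical helper)."""
    nodes = sorted(adj)
    index = {v: k for k, v in enumerate(nodes)}
    n = len(nodes)
    B = [[0] * n for _ in range(n)]

    def extend(start: int, current: int, visited: Set[int], remaining: int) -> None:
        # A path is complete once it has used the requested number of edges.
        if remaining == 0:
            B[index[start]][index[current]] += 1
            return
        for nxt in adj[current]:
            # Self-avoiding: never revisit a node already on the path.
            if nxt not in visited:
                visited.add(nxt)
                extend(start, nxt, visited, remaining - 1)
                visited.remove(nxt)

    for s in nodes:
        extend(s, s, {s}, length)
    return B


# Example: a 4-cycle 0-1-2-3-0 with paths of length 2.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
B_example = self_avoiding_path_matrix(cycle, 2)
# B_example == [[0, 0, 2, 0], [0, 0, 0, 2], [2, 0, 0, 0], [0, 2, 0, 0]]
```

Unlike the square of the ordinary adjacency matrix, which would have 2 on the diagonal for this cycle because of back-and-forth walks, the self-avoiding count has a zero diagonal.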

Weak Ramanujan Property:

The paper establishes a weak Ramanujan property for the matrix $B$: its spectrum consists of two leading eigenvalues, $\rho(B)$ and $\lambda_2$, while the remaining $n-2$ eigenvalues are of lower order, $O(n^{\epsilon}\sqrt{\rho(B)})$ for every $\epsilon > 0$, where $\rho(B)$ denotes the spectral radius of $B$.
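For comparison, the classical Ramanujan bound quoted in the abstract and the weak form for $B$ can be written side by side. Indexing the eigenvalues by decreasing modulus is a notational assumption made for this sketch.

```latex
% Ramanujan property of a d-regular graph (second eigenvalue \lambda):
|\lambda| \le 2\sqrt{d-1}

% Weak Ramanujan property of B, with |\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n|:
\lambda_1 = \rho(B), \qquad
|\lambda_k| = O\!\left(n^{\epsilon}\sqrt{\rho(B)}\right)
\quad \text{for all } k \ge 3 \text{ and every } \epsilon > 0
```

In both cases the bulk of the spectrum sits at roughly the square root of the leading eigenvalue; the weak form tolerates an extra factor $n^{\epsilon}$.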

Results and Impact

The results confirm that the community structure can be reconstructed when the model parameters satisfy $\tau=(a-b)^2/[2(a+b)] > 1$. The leading eigenvectors of the modified matrix $B$ then provide non-trivial detection of the hidden communities, settling the positive part of Decelle et al.'s conjecture. This outcome clarifies the phase transition in community detection for sparse graphs, that is, when meaningful signal can be extracted from graph observations.
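The threshold condition is easy to evaluate numerically. The sketch below computes $\tau$ from the formula above. The function names are hypothetical, and reading $a$ and $b$ as the within-community and cross-community edge parameters of a two-community sparse SBM (edge probabilities $a/n$ and $b/n$) is an assumption not spelled out in this summary.

```python
def detection_ratio(a: float, b: float) -> float:
    """Compute tau = (a - b)^2 / (2 (a + b)) for SBM parameters a and b.

    Assumption: a and b are the within- and cross-community edge parameters.
    """
    if a + b <= 0:
        raise ValueError("a + b must be positive")
    return (a - b) ** 2 / (2 * (a + b))


def above_threshold(a: float, b: float) -> bool:
    """True when tau > 1, the regime in which detection is shown to be possible."""
    return detection_ratio(a, b) > 1


# a = 5, b = 1 gives tau = 16 / 12, which is above the threshold.
strong_signal = above_threshold(5, 1)
# a = 3, b = 2 gives tau = 1 / 10, which is below the threshold.
weak_signal = above_threshold(3, 2)
```

The condition $\tau > 1$ is equivalent to $(a-b)^2 > 2(a+b)$, so a wider gap between $a$ and $b$ is needed as the average degree grows.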

Theoretical Implications and Future Directions

The theoretical implications of this work extend to the wider applicability of the path-expansion technique in spectral clustering, which could potentially be adapted to classes of graphs and models beyond the SBM. The work also suggests ways to strengthen spectral regularization methods by designing matrices that achieve a Ramanujan-like spectral separation. The findings encourage further exploration of spectral techniques, especially those that capture higher-order network features such as self-avoiding paths.

Speculative Directions:

Extending to labeled SBMs, which introduces additional complexities and variations.
Examining different methods of adjacency matrix regularization.
Investigating how these theories apply to other random graph models with diverse interaction probabilities.

In conclusion, this paper advances the field of theoretical computer science by resolving a pivotal conjecture about community detection thresholds and provides a robust framework supported by spectral graph theory that could spearhead future breakthroughs in graph partitioning and clustering tasks.