3D LPA: Connectivity-Aware Label Propagation
- The paper introduces a connectivity-aware label propagation mechanism by integrating a neighborhood-strength metric to improve label selection and community detection quality.
- It employs asynchronous updates and an active-node management strategy, drastically reducing redundant computations and iteration counts compared to standard LPA.
- Empirical evaluations on synthetic and real networks show enhanced modularity and scalability, particularly in settings with high local clustering.
The Connectivity-Aware Label Propagation Algorithm (3D LPA) is a community detection method that incorporates local neighborhood connectivity into the label update criterion, improving both computational efficiency and partition quality relative to the standard Label Propagation Algorithm (LPA). It achieves near-linear runtime through asynchronous updates and active-node management, and it improves accuracy, particularly in networks with high local clustering, by favoring neighbors that are tightly integrated within a node's immediate environment (Xie et al., 2011).
1. Neighborhood-Strength Measure
In the standard LPA, each node $i$ updates its label by selecting the most frequently occurring label among its neighbors $N(i)$. The 3D LPA introduces a neighborhood-strength metric $k_{ij}$ to quantify, for each neighbor $j \in N(i)$, the extent to which $j$ connects to the remainder of $i$'s local environment. This is formalized by:
$$k_{ij} = \left|\{\, u \in N(i) \setminus \{j\} : (j, u) \in E \,\}\right|$$
where $E$ is the edge set. Thus, $k_{ij}$ counts the number of edges from $j$ to other neighbors of $i$ (excluding $i$ itself). For each candidate label $\ell$ present in $N(i)$, the subset $N^{\ell}(i) \subseteq N(i)$ of neighbors carrying label $\ell$ receives a neighborhood-strength-driven score:
$$S_i(\ell) = \sum_{j \in N^{\ell}(i)} \left(1 + c\, k_{ij}\right)$$
where $c$ is a tunable parameter. At $c = 0$, 3D LPA reverts to standard LPA; at $c = 1$, each intra-neighborhood link receives equal weighting. The score thus fuses label frequency with intra-neighborhood cohesiveness.
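As a minimal sketch of the two quantities above, the following Python computes the neighborhood strength of one neighbor and the per-label scores for one node. It assumes an undirected simple graph (no self-loops) stored as a dict of neighbor sets, and an additive score of the form 1 + c × strength per neighbor, which is one form consistent with the description since it reduces to a plain label count at c = 0. The names `adj`, `neighborhood_strength`, and `label_scores` are illustrative and are not taken from the paper.

```python
from collections import defaultdict


def neighborhood_strength(adj, i, j):
    """Number of neighbors of i (other than j) that are adjacent to j.

    In a simple undirected graph this equals the number of common
    neighbors of i and j, so a set intersection is enough.
    """
    return len(adj[i] & adj[j])


def label_scores(adj, labels, i, c=1.0):
    """Score of every label seen among the neighbors of node i.

    Assumed form: each neighbor j contributes 1 + c * k_ij to its label.
    """
    scores = defaultdict(float)
    for j in adj[i]:
        scores[labels[j]] += 1.0 + c * neighborhood_strength(adj, i, j)
    return dict(scores)


# Toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
labels = {0: "a", 1: "b", 2: "b", 3: "c"}

print(label_scores(adj, labels, 0, c=1.0))  # label "b" scores 4.0, label "c" scores 1.0
print(label_scores(adj, labels, 0, c=0.0))  # label "b" scores 2.0, label "c" scores 1.0
```

On this toy graph the two triangle neighbors carrying label "b" each contribute 1 + c, while the pendant neighbor contributes only 1, so raising c widens the gap in favor of the cohesive pair.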
2. Generalized Label Update Rule
Instead of the majority-vote mechanism, 3D LPA enacts the following asynchronous update for each selected node $i$:
- For each candidate label $\ell$ present in $N(i)$, identify the set $N^{\ell}(i)$ of neighbors of $i$ carrying label $\ell$.
- Compute the score $S_i(\ell)$ as above for each candidate $\ell$.
- Set $\ell_i \leftarrow \arg\max_{\ell} S_i(\ell)$, breaking ties randomly.
This refinement weights influence so that neighbors more densely connected within $i$'s immediate vicinity exert proportionally greater effect. As $c$ increases, structurally central neighbors are systematically preferred over sparsely connected ones for label propagation.
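The effect of the weighting parameter on a single update can be illustrated with a short Python sketch. It assumes the additive score 1 + c × (number of common neighbors) per neighbor and a dict-of-sets graph; `update_label` and its arguments are hypothetical names chosen for this example, not the authors' implementation.

```python
import random
from collections import defaultdict


def update_label(adj, labels, i, c=1.0, rng=random):
    """Return the label node i would adopt under the weighted rule.

    Each neighbor j votes for its label with weight 1 + c * k_ij, where
    k_ij is the number of common neighbors of i and j. Ties between
    top-scoring labels are broken uniformly at random.
    """
    scores = defaultdict(float)
    for j in adj[i]:
        k_ij = len(adj[i] & adj[j])
        scores[labels[j]] += 1.0 + c * k_ij
    if not scores:  # isolated node keeps its label
        return labels[i]
    best = max(scores.values())
    winners = [lab for lab, s in scores.items() if s == best]
    return rng.choice(winners)


# Node 0 has two mutually connected neighbors with label "a"
# and three pendant neighbors with label "b".
adj = {0: {1, 2, 3, 4, 5}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}, 5: {0}}
labels = {0: "z", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

print(update_label(adj, labels, 0, c=0.0))  # "b": plain majority, 3 votes against 2
print(update_label(adj, labels, 0, c=1.0))  # "a": cohesive pair scores 4 against 3
```

With c = 0 the three pendant neighbors win by count; with c = 1 the two neighbors that also link to each other outweigh them, which is the behavior the rule is designed to produce.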
3. Algorithmic Workflow with Speed-Up Optimizations
3D LPA integrates two principal enhancements:
- The connectivity-weighted label selection as described.
- An active-node list that manages which nodes need possible label reevaluation, yielding substantial efficiency gains.
Initialization:
- Assign each node $i$ a unique label: $\ell_i \leftarrow$ unique identifier.
- ActiveList $\leftarrow$ all nodes.
Iterative Update Loop:
- Randomly select and remove node $i$ from ActiveList.
- For all $j \in N(i)$, compute the neighborhood strength $k_{ij}$.
- Partition $N(i)$ into $N^{\ell}(i)$ for each observed label $\ell$.
- Compute the score $S_i(\ell)$ for all $\ell$.
- Set label $\ell_i$ to the label with maximal $S_i(\ell)$ (breaking ties randomly).
- If $\ell_i$ changes, propagate status updates:
  - For $i$ and each neighbor $j \in N(i)$, determine node status:
    - Interior: all neighbors share its label.
    - Passive: node would not change label upon update.
    - Active: otherwise.
  - Update ActiveList to reflect status changes; add newly active, remove newly passive.
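The workflow above can be sketched end to end in Python. This is a simplified illustration under stated assumptions, not the authors' code: the graph is a dict of neighbor sets, the score is assumed to be the sum of 1 + c × (common neighbors) per neighbor, interior and passive nodes are treated alike (both are simply "not active"), and newly passive nodes are removed lazily when they are drawn rather than eagerly. All function names are hypothetical.

```python
import random
from collections import defaultdict


def label_scores(adj, labels, i, c):
    """Score of each label around node i: sum of 1 + c * k_ij over neighbors j."""
    scores = defaultdict(float)
    for j in adj[i]:
        k_ij = len(adj[i] & adj[j])  # neighbors of i that are also adjacent to j
        scores[labels[j]] += 1.0 + c * k_ij
    return scores


def is_active(adj, labels, i, c):
    """True if some label strictly outscores the node's current label.

    Interior and passive nodes both return False.
    """
    scores = label_scores(adj, labels, i, c)
    if not scores:  # isolated node: nothing to adopt
        return False
    return scores.get(labels[i], 0.0) < max(scores.values())


def connectivity_aware_lpa(adj, c=1.0, seed=0):
    rng = random.Random(seed)
    labels = {v: v for v in adj}  # unique initial labels
    active = list(adj)            # ActiveList <- all nodes
    queued = set(active)
    while active:
        # Remove a uniformly random entry in O(1) via swap-and-pop.
        pos = rng.randrange(len(active))
        active[pos], active[-1] = active[-1], active[pos]
        i = active.pop()
        queued.discard(i)
        if not is_active(adj, labels, i, c):
            continue  # stale entry: node became passive or interior while queued
        scores = label_scores(adj, labels, i, c)
        best = max(scores.values())
        labels[i] = rng.choice([lab for lab, s in scores.items() if s == best])
        # Node i now holds a top-scoring label; only its neighbors can have
        # changed status, so re-check them and queue the active ones.
        for j in adj[i]:
            if j not in queued and is_active(adj, labels, j, c):
                active.append(j)
                queued.add(j)
    return labels


# Two triangles joined by the bridge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(connectivity_aware_lpa(adj, c=1.0, seed=42))
```

Because a node changes only when another label strictly outscores its own, and the assumed weights are symmetric between the two endpoints of an edge, every label change increases the total weight of same-label edges, so the loop terminates with no active node left.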
Optimization remarks:
- Naïve computation of the neighborhood strengths scales as $O(d_i^2)$ for a node $i$ of degree $d_i$, but adjacency bitsets and incremental histograms reduce this to $O(d_i)$ (or amortized $O(1)$ per link).
- The active-node bookkeeping ensures each update is effective (i.e., results in a label change), minimizing redundant operations.
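One way to picture the adjacency-bitset idea is sketched below in Python, where each node's neighborhood is packed into an integer and the neighborhood strength becomes a bitwise AND followed by a population count. This is only an illustration of the general technique, assuming nodes are numbered with small non-negative integers; how the authors actually lay out their bitsets is not stated here, and `build_bitsets` and `strength_bitset` are hypothetical names.

```python
def build_bitsets(adj):
    """Map each node to an int whose bit u is set iff u is a neighbor."""
    masks = {}
    for v, nbrs in adj.items():
        mask = 0
        for u in nbrs:
            mask |= 1 << u
        masks[v] = mask
    return masks


def strength_bitset(masks, i, j):
    """Count common neighbors of i and j with one AND and a popcount."""
    # bin(...).count("1") is used for portability across Python versions.
    return bin(masks[i] & masks[j]).count("1")


# Toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
masks = build_bitsets(adj)

print(strength_bitset(masks, 0, 1))  # 1: node 2 is adjacent to both 0 and 1
print(strength_bitset(masks, 0, 3))  # 0: node 3 has no other link into N(0)
```

The result matches the set-intersection count `len(adj[i] & adj[j])`, but the work per pair is a handful of machine-word operations when the graph is small enough for the masks to stay compact.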
4. Computational Complexity Analysis
Let $n$ denote the number of nodes and $m$ the number of edges in a sparse graph ($m = O(n)$).
Original LPA:
- Each node scan: $O(d_i)$, where $d_i$ is the degree of node $i$.
- Empirically, the number of iterations per node, $T$, ranges from $1$ to $20$.
- Total complexity: $O(Tm)$.
3D LPA:
- Effective update cost: $O(d_i^2)$ for a node of degree $d_i$ using direct computation, reduced to $O(d_i)$ by histogram/bitset optimizations.
- Number of iterations matches the count of label changes (passive/interior nodes excluded), with observed empirically—representing $6$– fewer iterations than original LPA.
- Overall runtime per pass: $O(m)$, but with a significantly smaller constant and pass count.
Both algorithms maintain nominal linear scaling, but 3D LPA’s strictly “effective” updates deliver reduced running time and improved scalability for larger networks.
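The linear per-pass scaling can be checked with the handshake identity. As a worked sketch, assume that updating node $i$ costs time proportional to its degree $d_i$ in the optimized case and to $d_i^2$ in the direct case (both are assumptions about the per-node cost, with $d_{\max}$ denoting the maximum degree). Summing over all nodes gives:

```latex
\sum_{i} d_i = 2m
  \;\Longrightarrow\; \text{optimized cost per pass} = O(m),
\qquad
\sum_{i} d_i^{2} \le d_{\max} \sum_{i} d_i = 2m\, d_{\max}
  \;\Longrightarrow\; \text{direct cost per pass} = O(m\, d_{\max}).
```

Under these assumptions the optimized variant stays linear in the number of edges regardless of the degree distribution, whereas the direct computation is linear only when the maximum degree is bounded.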
5. Empirical Validation on Synthetic and Real Networks
3D LPA was evaluated on:
- Synthetic benchmarks: LFR benchmarks with a variable mixing parameter $\mu$.
- Metrics: Normalized Mutual Information (NMI), Adjusted Rand Index (ARI) against ground truth.
- For , all values yield similar results.
- For up to $0.65$, variants outperform by $5$– NMI/ARI, and maintain partition integrity where standard LPA fails.
- Real-world networks: Nine social graphs (size range $34$–$10,680$ nodes; datasets include karate, football, netscience, email, PGP).
- Metric: modularity $Q$, with $100$ runs per dataset.
- Highest $Q$: the connectivity-aware variant surpasses standard LPA by up to $0.04$ absolute.
- Mean $Q$: consistently higher and more stable for the connectivity-aware variant, with fewer all-in-one collapses ($Q = 0$).
- Iteration reduction: – across diverse networks.
Improvements are most pronounced when clustering coefficients are high, as the algorithm’s weighting rewards neighbors that form cohesive local structures.
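For readers who want to reproduce the quality metric, the following Python sketch computes modularity for a labeled partition. It assumes the standard Newman definition for undirected, unweighted graphs, which is taken here to be the $Q$ reported above; the `modularity` function and the dict-of-sets graph format are illustrative choices.

```python
from collections import defaultdict


def modularity(adj, labels):
    """Newman modularity of a partition of an undirected, unweighted graph.

    Q = sum over communities of (e_c / m) - (D_c / 2m)^2, where e_c is the
    number of edges inside community c and D_c is its total degree.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())  # each edge counted twice
    if two_m == 0:
        return 0.0
    intra = defaultdict(int)   # twice the number of intra-community edges
    degree = defaultdict(int)  # total degree per community
    for v, nbrs in adj.items():
        degree[labels[v]] += len(nbrs)
        for u in nbrs:
            if labels[u] == labels[v]:
                intra[labels[v]] += 1
    return sum(intra[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


# Two disjoint triangles.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
split = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
merged = {v: "all" for v in adj}

print(modularity(adj, split))   # 0.5
print(modularity(adj, merged))  # 0.0
```

The second call shows why an all-in-one collapse is penalized: placing every node in a single community always yields a modularity of zero under this definition.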
6. Context and Interpretation
3D LPA augments the parameter-free, asynchronous update framework of standard label propagation by embedding the neighborhood-strength score for each candidate label. The integration of connectivity-aware scoring and efficient bookkeeping produces accelerated convergence and reinforced community boundaries within topologies abundant in local triangles. This suggests that the method is particularly suited for network domains—social, biological, information—where clustering and intra-community density are signature features. A plausible implication is that the approach is robust against “collapse” into trivial partitions in the presence of strong local structure, supporting its utility in large-scale, real-world network analysis (Xie et al., 2011).