HW-GNN: Homophily-Aware Spectral Bot Detection
- HW-GNN is a homophily-aware graph spectral framework that uses learnable Gaussian-window filters to capture localized spectral anomalies for bot detection.
- It employs trainable spectral filters calibrated by homophily ratios, achieving notable macro F1 score improvements on diverse social graph benchmarks.
- The design offers plug-in compatibility with existing GNN models while providing actionable insights for detecting structural irregularities in network data.
HW-GNN refers here to a homophily-aware, Gaussian-window constrained graph spectral network specifically designed for social network bot detection. It represents a significant departure from broad-spectrum graph spectral methods by introducing an adaptive, domain-knowledge guided spectral filter mechanism that tightly aligns with graph homophily characteristics and spectral energy distributions.
1. Conceptual Foundations and Motivations
HW-GNN is oriented toward the structural detection of social bots on large-scale graphs using the spectral-domain properties of Graph Neural Networks (GNNs). Traditional spectral GNNs typically employ broad polynomial filters, which are insufficiently sensitive to localized, bot-related spectral anomalies due to their inability to focus sharply on relevant frequency bands. Moreover, graph homophily—a key measure of node label similarity among neighbors—is empirically tied to the distribution of spectral energy, with low homophily graph structures often revealing high-frequency anomalies characteristic of bot clusters. HW-GNN addresses these two weaknesses by integrating learnable, localized spectral filters and explicitly encoding the homophily–frequency relationship into filter parameterization and regularization (Liu et al., 27 Nov 2025).
2. Gaussian-Window Constrained Spectral Filtering
The network replaces coarse polynomial filtering with a bank of S learnable Gaussian windows in the spectral domain. For the normalized graph Laplacian L, whose spectrum satisfies λ ∈ [0, 2], HW-GNN defines S Gaussian windows

G_s(λ) = exp(−(λ − ω_s)² / (2σ_s²)),  s = 1, …, S,

where the centers ω_s and bandwidths σ_s are trainable parameters. Each windowed spectral filter g_s(λ) = G_s(λ) is realized via a degree-K polynomial approximation (e.g., Chebyshev, Bernstein, or Jacobi bases P_k), yielding coefficients

c_{s,k} = ∫₀² G_s(λ) P_k(λ) dλ,  k = 0, …, K,

such that the filter output for input features X is

Z_s = Σ_{k=0}^{K} c_{s,k} P_k(L) X W_s,

with learnable weight matrices W_s. Attention weights w_s = softmax(β_s) over logit parameters β_s fuse the filtered outputs with a residual connection:

H′ = σ( H + Σ_{s=1}^{S} w_s Z_s ).

This construction enables localized spectral focus, directly addressing structural anomalies induced by bots.
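As an illustrative sketch of this filtering step, the NumPy code below fits Chebyshev coefficients to a single Gaussian window on the Laplacian spectrum [0, 2] and applies the resulting filter without any eigendecomposition. The function names and grid size are our own choices, not the paper's, and the weight matrices W_s and attention fusion are omitted for brevity.

```python
import numpy as np

def gaussian_window(lam, omega, sigma):
    """Gaussian spectral window G_s(lam) = exp(-(lam - omega)^2 / (2 sigma^2))."""
    return np.exp(-((lam - omega) ** 2) / (2.0 * sigma ** 2))

def chebyshev_coeffs(omega, sigma, K, n_grid=200):
    """Fit K+1 Chebyshev coefficients to the window on lam in [0, 2].

    Chebyshev polynomials live on [-1, 1], so we substitute lam = t + 1
    (valid because the normalized Laplacian spectrum is [0, 2]).
    """
    t = np.cos(np.pi * (np.arange(n_grid) + 0.5) / n_grid)  # Chebyshev nodes
    f = gaussian_window(t + 1.0, omega, sigma)
    # Discrete cosine projection: c_k = (2/N) sum_j f(t_j) T_k(t_j); halve c_0
    c = np.array([2.0 / n_grid * np.sum(f * np.cos(k * np.arccos(t)))
                  for k in range(K + 1)])
    c[0] /= 2.0
    return c

def apply_filter(L_norm, X, coeffs):
    """Apply sum_k c_k T_k(L - I) X via the Chebyshev three-term recurrence."""
    I = np.eye(L_norm.shape[0])
    L_shift = L_norm - I  # maps the spectrum from [0, 2] to [-1, 1]
    T_prev, T_curr = X, L_shift @ X
    out = coeffs[0] * T_prev + coeffs[1] * T_curr
    for c_k in coeffs[2:]:
        T_prev, T_curr = T_curr, 2.0 * (L_shift @ T_curr) - T_prev
        out = out + c_k * T_curr
    return out
```

Because the recurrence runs on the shifted operator L − I, the filter is applied purely through sparse matrix products; the approximation sharpens as K grows, which is what lets a window isolate a narrow frequency band.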
3. Homophily-Aware Adaptation Mechanism
The homophily ratio h is computed as

h = |{(u, v) ∈ E : y_u = y_v}| / |E|,

where y_u ∈ {0, 1} denotes the genuine/bot label of node u. HW-GNN translates h into a target center frequency

ω̄ = 2(1 − h),

mapping low homophily (high heterophily) toward high-frequency spectral focus. Two lightweight multilayer perceptrons parameterize the S window centers and bandwidths as functions of ω̄ and the window index s:

ω_s = MLP_ω(ω̄, s),  σ_s = MLP_σ(ω̄, s).

Training incorporates a spectral-distribution regularizer enforcing proximity of the learned window centers to ω̄:

L_freq = (1 / (C·S)) Σ_{c=1}^{C} Σ_{s=1}^{S} ( ω_s^{(c)} − ω̄ )²,

where C is the number of HW-GNN blocks.
By this mechanism, HW-GNN injects class-structural priors directly into the spectral representation, maximizing the discriminative utility of localized graph frequencies.
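The homophily computation, frequency mapping, and regularizer are simple enough to state directly. The sketch below (our notation, assuming an undirected edge list and flat array of learned centers) follows the definitions above:

```python
import numpy as np

def homophily_ratio(edges, labels):
    """h = |{(u, v) in E : y_u = y_v}| / |E| -- fraction of same-label edges."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)

def target_center(h):
    """Map the homophily ratio to the target spectral center: omega_bar = 2(1 - h).
    Low homophily (bot-heavy, heterophilous structure) pushes the target
    toward the high-frequency end of the [0, 2] spectrum."""
    return 2.0 * (1.0 - h)

def freq_loss(centers, omega_bar):
    """Spectral-distribution regularizer: mean squared gap between the
    learned window centers (across all blocks and windows) and omega_bar."""
    centers = np.asarray(centers, dtype=float)
    return float(np.mean((centers - omega_bar) ** 2))
```

In practice `centers` would collect the C·S learned ω_s^{(c)} values, so the mean implements the 1/(C·S) normalization of the regularizer.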
4. Model Architecture and Training Flow
HW-GNN stacks C identical blocks (layers), each integrating S windowed spectral convolutions with attention fusion, residual skip connections, and nonlinearities:

H^{(ℓ+1)} = σ( H^{(ℓ)} + Σ_{s=1}^{S} w_s Z_s^{(ℓ)} ),  ℓ = 0, …, C − 1.
After C layers, final node logits are produced by an MLP or linear head.
The objective function combines focal loss—mitigating class imbalance in bot detection—with the spectral regularization:

L = L_focal + λ_f · L_freq.

The focal loss is

L_focal = −Σ_i α_i (1 − p_i)^γ log p_i,

where p_i is the predicted probability of node i's true class, and the hyperparameters α_i and γ tune class weighting and the down-weighting of easy examples.
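A minimal NumPy rendering of the standard binary focal-loss term makes the down-weighting behavior concrete; the α values here are illustrative defaults, not the paper's settings:

```python
import numpy as np

def focal_loss(p, y, alpha=(0.25, 0.75), gamma=2.0):
    """Binary focal loss: mean of -alpha_i * (1 - p_i)^gamma * log(p_i),
    where p_i is the probability assigned to node i's true class.
    alpha weights the two classes; gamma suppresses easy examples."""
    p = np.asarray(p, dtype=float)   # predicted P(bot) per node
    y = np.asarray(y)                # true labels: 0 = genuine, 1 = bot
    p_true = np.where(y == 1, p, 1.0 - p)     # prob of the true class
    a = np.where(y == 1, alpha[1], alpha[0])  # per-class weight alpha_i
    return float(np.mean(-a * (1.0 - p_true) ** gamma * np.log(p_true)))
```

With γ = 0 and α = 1 this reduces to plain cross-entropy; raising γ shrinks the contribution of confidently classified nodes, so gradient signal concentrates on hard, often minority-class, examples.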
Efficient layerwise polynomial approximation and parameter updates enable HW-GNN to maintain plug-in compatibility with prior spectral GNNs.
5. Implementation Details and Pseudocode
Training proceeds by (1) computing h and ω̄, (2) initializing the window parameters, (3) updating the window parameters via the MLPs and recomputing polynomial coefficients each epoch, (4) applying filtering, fusion, and residual propagation across the C blocks, and (5) optimizing the classification and frequency-distribution regularization objectives by backpropagation.
Pseudocode excerpt:
```python
for epoch in range(MaxEpoch):
    for batch in data:
        h = compute_homophily(batch.labels, batch.edges)
        omega_bar = 2 * (1 - h)
        for layer in range(C):
            omega_s, sigma_s = MLP_omega(omega_bar, s), MLP_sigma(omega_bar, s)
            # project each Gaussian window onto the polynomial basis P_k over [0, 2]
            c_sk = integral(lambda lam: exp(-(lam - omega_s)**2 / (2 * sigma_s**2)) * Pk(lam), 0, 2)
            Z_s = sum(c_sk * Pk(L) * H[layer] * W_s)
            w_s = softmax(beta_s)
            H[layer + 1] = activation(H[layer] + sum(w_s * Z_s))
        loss = focal_loss(H[C], labels) + lambda_f * freq_loss(omega_hat, omega_bar)
        update_parameters(loss)
```
6. Empirical Evaluation
HW-GNN was benchmarked on five social graph datasets (TwiBot-20, TwiBot-22, MGTAB, T-Social, T-Finance) covering various scales and homophily profiles. Evaluated against 15+ baselines spanning MLPs, spatial GNNs, and spectral GNNs, HW-GNN achieved state-of-the-art macro F1 scores with an average improvement of 4.3%:
- TwiBot-20: 91.51 vs 89.89 (+1.6)
- TwiBot-22: 61.95 vs 59.42 (+2.5)
- MGTAB: 89.37 vs 88.53 (+0.8)
- T-Social: 94.39 vs 83.98 (+10.4)
- T-Finance: 91.95 vs 87.63 (+4.3)
Ablation studies confirm the critical contribution of the Gaussian windows, the homophily-guided adaptation, and the multi-window design: removing any one of them reduces macro F1 by 1–10%. Sensitivity analyses identify stable optima at S ∈ [4, 6], K ∈ [3, 5], and λ_f approximately 0.2–0.5.
7. Applicability, Limitations, and Perspectives
HW-GNN directly addresses graph spectral analysis requirements in bot detection scenarios characterized by class heterogeneity and localized structural anomalies. Immediate plug-in compatibility with standard spectral GNNs assures rapid adoption. However, the paradigm presupposes that homophily–frequency relationships are a primary driver of spectral discriminability; in domains lacking this linkage, adaptations may be required.
A plausible implication is that future research will generalize homophily-guided spectral adaptation to broader relational learning tasks beyond bot detection and examine alternative filter window parameterizations. The Gaussian windowing approach provides a flexible, differentiable mechanism that could interface with multi-scale and attention-based GNN architectures for improved structural anomaly sensitivity.