Certified Signed Graph Unlearning (CSGU)
- The paper introduces CSGU, which achieves certified unlearning in SGNNs through triadic influence neighborhoods, sociological weighting, and DP-calibrated updates.
- CSGU leverages balance and status theories to quantify sociological influence, ensuring that model utility is maintained after data deletion.
- Empirical results on multiple datasets show CSGU outperforms baselines with up to +13.9% Macro-F1 improvement and reduced membership-inference risks.
Certified Signed Graph Unlearning (CSGU) is a framework for provably private, semantics‐preserving removal of user‐specified data from Signed Graph Neural Networks (SGNNs). By harnessing sociological properties of signed network data and formally grounded differential privacy mechanisms, CSGU provides strong theoretical and empirical guarantees that the influence of deleted edges, nodes, or features on the trained model is effectively and efficiently erased while preserving model utility. It is the first method to address the unique challenges of unlearning in signed (as opposed to unsigned) graph settings, where sign structure carries critical information for model behavior (Zhao et al., 18 Nov 2025).
1. Background and Foundations
A signed graph models systems with both positive ($+$) and negative ($-$) edges. Edge signs are assembled into a signed adjacency matrix $A^s$ with entries $+1$, $-1$, or $0$. Unlike conventional GNNs, which assume homophily and only positive connections, SGNNs incorporate both positive and negative relationships, grounding message passing in sociological theories:
- Balance theory: Governs triadic consistency, implying patterns such as “the friend of my friend is my friend.”
- Status theory: Assigns implicit status levels, orienting edges as status-increasing (positive) or status-decreasing (negative).
A generic SGNN layer updates node representations by sign-aware aggregation over separate positive and negative neighborhoods.
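As an illustrative sketch (generic notation, assumed rather than taken from the paper), such a sign-aware layer can be written as:

```latex
h_i^{(l+1)} = \sigma\!\left(W^{(l)}\left[\operatorname{AGG}\big(\{h_j^{(l)} : j \in \mathcal{N}^+(i)\}\big)
\,\big\|\, \operatorname{AGG}\big(\{h_j^{(l)} : j \in \mathcal{N}^-(i)\}\big)\right]\right)
```

where $\mathcal{N}^+(i)$ and $\mathcal{N}^-(i)$ are the positive and negative neighborhoods of node $i$ and $\|$ denotes concatenation; concrete SGNNs (e.g. SGCN) maintain separate positive and negative representations per layer.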
Graph unlearning seeks to produce a new parameter set that is statistically indistinguishable from retraining on a graph with deletions applied. Certified unlearning tightens this by requiring that the outputs of the unlearned model are $(\varepsilon, \delta)$-indistinguishable (in a differential privacy sense) from a clean retrain.
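Concretely, writing $\mathcal{U}$ for the unlearning mechanism applied to trained parameters $\theta^*$ and $\mathcal{R}$ for retraining on the reduced graph (the standard certified-removal formulation, not notation specific to this paper), the requirement is that for every measurable set $S$ of parameter values:

```latex
\Pr\big[\mathcal{U}(\theta^*, G, \Delta) \in S\big] \;\le\; e^{\varepsilon}\,\Pr\big[\mathcal{R}(G \setminus \Delta) \in S\big] + \delta,
```

and symmetrically with $\mathcal{U}$ and $\mathcal{R}$ exchanged, where $\Delta$ is the deleted data.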
2. CSGU Architecture and Sequential Phases
CSGU is structured as a three-stage procedure:
| Phase ID | Name | Core Operation |
|---|---|---|
| I | Triadic Influence Neighborhood | Identify minimal region of correlated influence via triadic closures |
| II | Sociological Influence Quantification | Assign weights based on balance and status centralities |
| III | Weighted Certified Unlearning | Execute parameter update, calibrate with DP noise, finalize new model |
Phase I: Triadic Influence Neighborhood (TIN). Instead of naïve $k$-hop expansions, TIN iteratively expands the set of affected edges via triadic closures, targeting all edges whose loss gradients are not orthogonal to those of the deleted set. This leverages the central role of triangles in balance theory. TIN typically converges in $2$–$4$ steps on sparse graphs and produces certification regions whose size scales with the average triangle participation per edge, offering a substantial improvement in efficiency and specificity over hop-based methods.
Phase II: Sociological Influence Quantification (SIQ). SIQ assigns a real-valued weight to each affected edge, reflecting its sociological importance:
- Balance centrality: For node $i$, the fraction of triangles incident to $i$ that are balanced.
- Status centrality: For node $i$, an aggregated score reflecting hierarchical standing, weighted by edge signs and degree.
These are unified, normalized, and aggregated into edge weights by softmax and averaging.
Phase III: Weighted Certified Unlearning (WCU). The unlearning update is computed as a weighted, influence-function-based first-order correction, i.e., a Taylor expansion about the trained parameters $\theta^*$.
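In standard influence-function form (a sketch of the usual certified-removal update, with $w_e$ the Phase II weights, $H_{\theta^*}$ the Hessian of the training loss at $\theta^*$, and $\ell_e$ the per-edge loss; the paper's exact formulation may differ):

```latex
\theta^{-} \;=\; \theta^{*} \;+\; H_{\theta^{*}}^{-1} \sum_{e \in \mathcal{N}} w_e \,\nabla_{\theta}\,\ell_e(\theta^{*}),
```

where $\mathcal{N}$ is the certification region from Phase I.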
The sensitivity of this update is assessed and calibrated for $(\varepsilon, \delta)$-DP using the Gaussian mechanism, ensuring that the unlearned model parameters are privacy-preserving and close in distribution to a true retrained model.
3. Algorithmic and Mathematical Details
Phase I: Triadic Influence Neighborhood
The certification region $\mathcal{N}$ is constructed as the minimal edge set such that, for every edge outside $\mathcal{N}$, the loss gradient is orthogonal to those of the deletion set. Triadic closure adds edges $(i,k)$ and $(j,k)$ to $\mathcal{N}$ whenever a triangle $T_{ijk}$ containing an already-included edge $(i,j)$ exists in the graph. Iterative expansion continues until no new qualifying edges can be added.
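A minimal sketch of the iterative triadic expansion (illustrative data structures only; the paper additionally filters candidate edges by gradient alignment, which is omitted here and would limit the region's growth):

```python
from collections import defaultdict


def triadic_influence_neighborhood(edges, deleted):
    """Expand a set of affected edges via triadic closure until fixpoint.

    edges: iterable of undirected edges (u, v); deleted: initial deletion set.
    Returns the certification region as a set of frozenset edges.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    region = {frozenset(e) for e in deleted}
    frontier = set(region)
    while frontier:
        new_edges = set()
        for e in frontier:
            u, v = tuple(e)
            for w in adj[u] & adj[v]:  # w closes a triangle with (u, v)
                for cand in (frozenset((u, w)), frozenset((v, w))):
                    if cand not in region:
                        new_edges.add(cand)
        region |= new_edges
        frontier = new_edges  # converges in a few rounds on sparse graphs
    return region
```

On a graph with one triangle $(1,2,3)$ plus a pendant edge $(3,4)$, deleting $(1,2)$ pulls in only the two remaining triangle edges, not the pendant edge.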
Phase II: Influence Quantification
Let $\mathcal{T}_i$ denote the set of triangles incident to node $i$. For each node:
- Balance centrality: $C_B(i) = \frac{1}{\lvert\mathcal{T}_i\rvert}\sum_{T_{ijk}\in\mathcal{T}_i}\mathcal{B}(T_{ijk})$, with the balance indicator $\mathcal{B}(T_{ijk})=\mathbb{1}[A^s_{ij} A^s_{jk} A^s_{ki}=+1]$
- Status centrality $C_S(i)$: an aggregated, sigmoid-squashed score of hierarchical standing ($\sigma$ denotes the sigmoid, with the mean degree as a scale)
- Unified influence: a normalized combination of the balance and status centralities
- Node weights are softmaxed; the weight of edge $(i,j)$ is $\min\!\big(\tfrac12(w_i + w_j),\, 1\big)$.
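The weighting pipeline can be sketched as follows. Only the balance-centrality component is shown; the status term and the unification step are omitted, so this is an illustrative simplification rather than the paper's full Phase II:

```python
import numpy as np


def siq_edge_weights(triangles, signs, nodes, edges):
    """Toy SIQ weighting: balance centrality -> softmax node weights
    -> edge weight = min(average of endpoint weights, 1).

    triangles: list of node triples (i, j, k);
    signs: dict mapping frozenset((u, v)) -> +1 or -1.
    """
    balance = {v: [] for v in nodes}
    for i, j, k in triangles:
        s = (signs[frozenset((i, j))]
             * signs[frozenset((j, k))]
             * signs[frozenset((i, k))])
        for v in (i, j, k):
            balance[v].append(1.0 if s > 0 else 0.0)
    # Balance centrality: fraction of incident triangles that are balanced.
    c = np.array([np.mean(balance[v]) if balance[v] else 0.0 for v in nodes])
    w = np.exp(c) / np.exp(c).sum()  # softmax over node scores
    idx = {v: t for t, v in enumerate(nodes)}
    return {(u, v): min((w[idx[u]] + w[idx[v]]) / 2.0, 1.0)
            for (u, v) in edges}
```

For a single all-positive (balanced) triangle, every node gets the same centrality, the softmax is uniform, and each edge weight is $1/3$.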
Phase III: Certified Update and DP Calibration
Weighted binary cross-entropy is minimized over the edges of $\mathcal{N}$, with the parameter shift given by a first-order Taylor correction (Hessian and gradient computed over $\mathcal{N}$). Per-edge sensitivity is bounded in terms of the edge weight and the strong-convexity constant $\lambda$, and the DP noise scale is set using the Gaussian mechanism as $\sigma = c\,\Delta/\varepsilon$ with $c=\sqrt{2\ln(1.25/\delta)}$, where $\Delta$ is the total sensitivity. The final parameters are $\tilde\theta = \theta^- + b$ with $b \sim \mathcal{N}(0, \sigma^2 I_p)$.
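The calibration step follows the standard Gaussian mechanism; a sketch (the `sensitivity` argument is a placeholder for the paper's per-edge sensitivity bound, which is not reproduced here):

```python
import numpy as np


def gaussian_calibrated_update(theta_minus, sensitivity, eps, delta, rng=None):
    """Perturb the unlearned parameters with Gaussian-mechanism noise:
    sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / eps.
    Returns the released parameters and the noise scale used.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noise = rng.normal(0.0, sigma, size=theta_minus.shape)
    return theta_minus + noise, sigma
```

Smaller $\varepsilon$ (stronger privacy) or larger sensitivity both inflate $\sigma$, which is the utility/privacy trade-off measured in the experiments.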
4. Theoretical Guarantees
The certification region and weighted update guarantee $(\varepsilon, \delta)$-certified unlearning under strong convexity and triadic completeness: the unlearned parameters are statistically indistinguishable from those of a full retrain on the reduced graph.
Expected utility degradation compared to retraining is bounded, scaling with the injected noise variance and the size of the affected region.
These results formalize statistical indistinguishability and scalability with respect to model dimension, privacy budget, and the size of the affected region.
5. Computational Complexity and Implementation Considerations
| Complexity Type | Bound |
|---|---|
| Time | $O(\lvert\mathcal{N}\rvert \cdot \bar{t} \cdot d_{\max} + p^3)$ ($\bar{t}$: avg triangle count; $d_{\max}$: max degree; $p$: param dim) |
| Space | $O(\lvert\mathcal{N}\rvert + p^2)$ (edges in region + Hessian) |
Hessian inversion, dominant at $O(p^3)$, is accelerated by conjugate-gradient solves, which require only Hessian–vector products. Practical considerations include precomputing triangle indices, using $L_2$-regularization to ensure strong convexity ($\lambda > 0$), and tuning the relative weighting of balance and status centrality.
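A matrix-free conjugate-gradient solve can replace explicit inversion; the sketch below (a generic CG routine, assuming a symmetric positive-definite Hessian as guaranteed by the $L_2$ regularization) never materializes $H$:

```python
import numpy as np


def cg_solve(hvp, b, tol=1e-10, max_iter=1000):
    """Conjugate-gradient solve of H x = b given only a Hessian-vector
    product hvp(v); H is never materialized, so memory stays O(p)."""
    x = np.zeros_like(b)
    r = b - hvp(x)          # initial residual
    d = r.copy()            # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Hd = hvp(d)
        alpha = rs / (d @ Hd)
        x += alpha * d
        r -= alpha * Hd
        rs_new = r @ r
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x
```

In the WCU update, `hvp` would be the SGNN loss Hessian-vector product and `b` the weighted gradient sum over the certification region.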
6. Empirical Evaluation
CSGU was validated on Bitcoin-Alpha, Bitcoin-OTC, Epinions, and Slashdot signed graph datasets (3k–33k nodes), using SGCN, SNEA, SDGNN, and SiGAT backbones. Baselines included Retrain, GraphEraser, GNNDelete, GIF, and IDEA. Key findings:
- Utility: CSGU attains up to +13.9% Macro-F1 compared to best baseline.
- Privacy: Membership-inference AUC is reduced by up to 10.9% relative to the best comparator.
- Efficiency: Per-unlearning latency is sub-second to a few seconds, far outperforming retraining or GraphEraser.
Ablation studies confirmed that each stage (TIN, SIQ, DP loss, noise injection) is essential. Notably, CSGU maintains robust performance for negative-edge deletions, outperforming naïve methods. Varying deletion ratios ($0.5$–$5\%$) and privacy budgets yielded stable trade-offs.
CSGU generalizes to unsigned graphs by substituting degree centrality for sociological weights, maintaining or exceeding baseline performance in utility and privacy at similar runtime.
7. Limitations and Research Directions
CSGU relies on local strong convexity (regularized SGNNs) and the presence of sufficient triangular structure; its efficiency and guarantees degrade for extremely low-triangle graphs. The global DP budget accumulates with sequential applications, though advanced composition could mitigate this overhead. Limitations include:
- Necessity of triangular motifs for TIN expansion
- Assumed local convexity around the trained parameters
- Global DP budget growth under repeated unlearning
Proposed extensions include adaptive convexity handling (higher-order influence to enable non-convex regions), support for dynamic and streaming graphs, modeling with longer balanced motifs, and integration with federated SGNN unlearning for distributed privacy protection.
CSGU inaugurates certified unlearning for the sociologically rich domain of signed graphs, synthesizing balance/status theory with formal privacy to provide both practical and theoretical assurances (Zhao et al., 18 Nov 2025).