Certified Signed Graph Unlearning (CSGU)
- The paper introduces CSGU, which achieves certified unlearning in SGNNs through triadic influence neighborhoods, sociological weighting, and DP-calibrated updates.
- CSGU leverages balance and status theories to quantify sociological influence, ensuring that model utility is maintained after data deletion.
- Empirical results on multiple datasets show CSGU outperforms baselines with up to +13.9% Macro-F1 improvement and reduced membership-inference risks.
Certified Signed Graph Unlearning (CSGU) is a framework for provably private, semantics‐preserving removal of user‐specified data from Signed Graph Neural Networks (SGNNs). By harnessing sociological properties of signed network data and formally grounded differential privacy mechanisms, CSGU provides strong theoretical and empirical guarantees that the influence of deleted edges, nodes, or features on the trained model is effectively and efficiently erased while preserving model utility. It is the first method to address the unique challenges of unlearning in signed (as opposed to unsigned) graph settings, where sign structure carries critical information for model behavior (Zhao et al., 18 Nov 2025).
1. Background and Foundations
A signed graph models systems with both positive ($+$) and negative ($-$) edges. Edge signs are assembled into a signed adjacency matrix $A^s$ with entries $+1$, $-1$, or $0$. Unlike conventional GNNs, which assume homophily and only positive connections, SGNNs incorporate both positive and negative relationships, grounding message passing in sociological theories:
- Balance theory: Governs triadic consistency, implying patterns such as “the friend of my friend is my friend.”
- Status theory: Assigns implicit status levels, orienting edges as status-increasing (positive) or status-decreasing (negative).
A generic SGNN layer updates node representations by sign-aware aggregation over separate positive and negative neighborhoods.
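As an illustrative sketch (generic notation, assumed rather than taken from the paper), such a sign-aware layer can be written as:

```latex
h_i^{(l+1)} = \sigma\!\left(W^{(l)}\left[\operatorname{AGG}\big(\{h_j^{(l)} : j \in \mathcal{N}^+(i)\}\big)
\,\big\|\, \operatorname{AGG}\big(\{h_j^{(l)} : j \in \mathcal{N}^-(i)\}\big)\right]\right)
```

where $\mathcal{N}^+(i)$ and $\mathcal{N}^-(i)$ are the positive and negative neighborhoods of node $i$ and $\|$ denotes concatenation; concrete SGNNs (e.g. SGCN) maintain separate positive and negative representations per layer.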
Graph unlearning seeks to produce a new parameter set that is statistically indistinguishable from retraining on a graph with deletions applied. Certified unlearning tightens this by requiring that the outputs of the unlearned model are $(\varepsilon, \delta)$-indistinguishable (in a differential privacy sense) from a clean retrain.
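Concretely, writing $\mathcal{U}$ for the unlearning mechanism applied to trained parameters $\theta^*$ and $\mathcal{R}$ for retraining on the reduced graph (the standard certified-removal formulation, not notation specific to this paper), the requirement is that for every measurable set $S$ of parameter values:

```latex
\Pr\big[\mathcal{U}(\theta^*, G, \Delta) \in S\big] \;\le\; e^{\varepsilon}\,\Pr\big[\mathcal{R}(G \setminus \Delta) \in S\big] + \delta,
```

and symmetrically with $\mathcal{U}$ and $\mathcal{R}$ exchanged, where $\Delta$ is the deleted data.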
2. CSGU Architecture and Sequential Phases
CSGU is structured as a three-stage procedure:
| Phase ID | Name | Core Operation |
|---|---|---|
| I | Triadic Influence Neighborhood | Identify minimal region of correlated influence via triadic closures |
| II | Sociological Influence Quantification | Assign weights based on balance and status centralities |
| III | Weighted Certified Unlearning | Execute parameter update, calibrate with DP noise, finalize new model |
Phase I: Triadic Influence Neighborhood (TIN). Instead of naïve $k$-hop expansions, TIN iteratively expands the set of affected edges via triadic closures, targeting all edges whose loss gradients are not orthogonal to those of the deleted set. This leverages the central role of triangles in balance theory. TIN typically converges in $2$–$4$ steps on sparse graphs and produces certification regions whose size scales with the average triangle participation per edge, offering a substantial improvement in efficiency and specificity over hop-based methods.
Phase II: Sociological Influence Quantification (SIQ). SIQ assigns a real-valued weight to each affected edge, reflecting its sociological importance:
- Balance centrality: For node $i$, the fraction of triangles incident to $i$ that are balanced.
- Status centrality: For node $i$, an aggregated score reflecting hierarchical standing, weighted by edge signs and degree.
These are unified, normalized, and aggregated into edge weights by softmax and averaging.
Phase III: Weighted Certified Unlearning (WCU). The unlearning update is computed as a weighted, influence-function-based first-order correction, i.e., a Taylor expansion about the trained parameters $\theta^*$.
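In standard influence-function form (a sketch of the usual certified-removal update, with $w_e$ the Phase II weights, $H_{\theta^*}$ the Hessian of the training loss at $\theta^*$, and $\ell_e$ the per-edge loss; the paper's exact formulation may differ):

```latex
\theta^{-} \;=\; \theta^{*} \;+\; H_{\theta^{*}}^{-1} \sum_{e \in \mathcal{N}} w_e \,\nabla_{\theta}\,\ell_e(\theta^{*}),
```

where $\mathcal{N}$ is the certification region from Phase I.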
The sensitivity of this update is assessed and calibrated for $(\varepsilon, \delta)$-DP using the Gaussian mechanism, ensuring that the unlearned model parameters are privacy-preserving and close in distribution to a true retrained model.
3. Algorithmic and Mathematical Details
Phase I: Triadic Influence Neighborhood
The certification region $\mathcal{N}$ is constructed as the minimal edge set such that, for every edge outside $\mathcal{N}$, the loss gradient is orthogonal to those of the deletion set. Triadic closure adds edges $(i,k)$ and $(j,k)$ to $\mathcal{N}$ whenever a triangle $T_{ijk}$ containing an already-included edge $(i,j)$ exists in the graph. Iterative expansion continues until no new qualifying edges can be added.
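A minimal sketch of the iterative triadic expansion (illustrative data structures only; the paper additionally filters candidate edges by gradient alignment, which is omitted here and would limit the region's growth):

```python
from collections import defaultdict


def triadic_influence_neighborhood(edges, deleted):
    """Expand a set of affected edges via triadic closure until fixpoint.

    edges: iterable of undirected edges (u, v); deleted: initial deletion set.
    Returns the certification region as a set of frozenset edges.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    region = {frozenset(e) for e in deleted}
    frontier = set(region)
    while frontier:
        new_edges = set()
        for e in frontier:
            u, v = tuple(e)
            for w in adj[u] & adj[v]:  # w closes a triangle with (u, v)
                for cand in (frozenset((u, w)), frozenset((v, w))):
                    if cand not in region:
                        new_edges.add(cand)
        region |= new_edges
        frontier = new_edges  # converges in a few rounds on sparse graphs
    return region
```

On a graph with one triangle $(1,2,3)$ plus a pendant edge $(3,4)$, deleting $(1,2)$ pulls in only the two remaining triangle edges, not the pendant edge.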
Phase II: Influence Quantification
Let $\mathcal{T}_i$ denote the set of triangles incident to node $i$. For each node:
- Balance centrality: $C_B(i) = \frac{1}{\lvert\mathcal{T}_i\rvert}\sum_{T_{ijk}\in\mathcal{T}_i}\mathcal{B}(T_{ijk})$, with the balance indicator $\mathcal{B}(T_{ijk})=\mathbb{1}[A^s_{ij} A^s_{jk} A^s_{ki}=+1]$
- Status centrality $C_S(i)$: an aggregated, sigmoid-squashed score of hierarchical standing ($\sigma$ denotes the sigmoid, with the mean degree as a scale)
- Unified influence: a normalized combination of the balance and status centralities
- Node weights are softmaxed; the weight of edge $(i,j)$ is $\min\!\big(\tfrac12(w_i + w_j),\, 1\big)$.
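The weighting pipeline can be sketched as follows. Only the balance-centrality component is shown; the status term and the unification step are omitted, so this is an illustrative simplification rather than the paper's full Phase II:

```python
import numpy as np


def siq_edge_weights(triangles, signs, nodes, edges):
    """Toy SIQ weighting: balance centrality -> softmax node weights
    -> edge weight = min(average of endpoint weights, 1).

    triangles: list of node triples (i, j, k);
    signs: dict mapping frozenset((u, v)) -> +1 or -1.
    """
    balance = {v: [] for v in nodes}
    for i, j, k in triangles:
        s = (signs[frozenset((i, j))]
             * signs[frozenset((j, k))]
             * signs[frozenset((i, k))])
        for v in (i, j, k):
            balance[v].append(1.0 if s > 0 else 0.0)
    # Balance centrality: fraction of incident triangles that are balanced.
    c = np.array([np.mean(balance[v]) if balance[v] else 0.0 for v in nodes])
    w = np.exp(c) / np.exp(c).sum()  # softmax over node scores
    idx = {v: t for t, v in enumerate(nodes)}
    return {(u, v): min((w[idx[u]] + w[idx[v]]) / 2.0, 1.0)
            for (u, v) in edges}
```

For a single all-positive (balanced) triangle, every node gets the same centrality, the softmax is uniform, and each edge weight is $1/3$.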
Phase III: Certified Update and DP Calibration
Weighted binary cross-entropy is minimized over the edges of $\mathcal{N}$, with the parameter shift given by a first-order Taylor correction (Hessian and gradient computed over $\mathcal{N}$). Per-edge sensitivity is bounded in terms of the edge weight and the strong-convexity constant $\lambda$, and the DP noise scale is set using the Gaussian mechanism as $\sigma = c\,\Delta/\varepsilon$ with $c=\sqrt{2\ln(1.25/\delta)}$, where $\Delta$ is the total sensitivity. The final parameters are $\tilde\theta = \theta^- + b$ with $b \sim \mathcal{N}(0, \sigma^2 I_p)$.
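The calibration step follows the standard Gaussian mechanism; a sketch (the `sensitivity` argument is a placeholder for the paper's per-edge sensitivity bound, which is not reproduced here):

```python
import numpy as np


def gaussian_calibrated_update(theta_minus, sensitivity, eps, delta, rng=None):
    """Perturb the unlearned parameters with Gaussian-mechanism noise:
    sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / eps.
    Returns the released parameters and the noise scale used.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noise = rng.normal(0.0, sigma, size=theta_minus.shape)
    return theta_minus + noise, sigma
```

Smaller $\varepsilon$ (stronger privacy) or larger sensitivity both inflate $\sigma$, which is the utility/privacy trade-off measured in the experiments.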
4. Theoretical Guarantees
The certification region and weighted update guarantee $(\varepsilon, \delta)$-certified unlearning under strong convexity and triadic completeness: the unlearned parameters are statistically indistinguishable from those of a full retrain on the reduced graph.
Expected utility degradation compared to retraining is bounded, scaling with the injected noise variance and the size of the affected region.
These results formalize statistical indistinguishability and scalability with respect to model dimension, privacy budget, and the size of the affected region.
5. Computational Complexity and Implementation Considerations
| Complexity Type | Bound |
|---|---|
| Time | $O(\lvert\mathcal{N}\rvert \cdot \bar{t} \cdot d_{\max} + p^3)$ ($\bar{t}$: avg triangle count; $d_{\max}$: max degree; $p$: param dim) |
| Space | $O(\lvert\mathcal{N}\rvert + p^2)$ (edges in region + Hessian) |
Hessian inversion, dominant at $O(p^3)$, is accelerated by conjugate-gradient solves, which require only Hessian–vector products. Practical considerations include precomputing triangle indices, using $L_2$-regularization to ensure strong convexity ($\lambda > 0$), and tuning the relative weighting of balance and status centrality.
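A matrix-free conjugate-gradient solve can replace explicit inversion; the sketch below (a generic CG routine, assuming a symmetric positive-definite Hessian as guaranteed by the $L_2$ regularization) never materializes $H$:

```python
import numpy as np


def cg_solve(hvp, b, tol=1e-10, max_iter=1000):
    """Conjugate-gradient solve of H x = b given only a Hessian-vector
    product hvp(v); H is never materialized, so memory stays O(p)."""
    x = np.zeros_like(b)
    r = b - hvp(x)          # initial residual
    d = r.copy()            # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Hd = hvp(d)
        alpha = rs / (d @ Hd)
        x += alpha * d
        r -= alpha * Hd
        rs_new = r @ r
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x
```

In the WCU update, `hvp` would be the SGNN loss Hessian-vector product and `b` the weighted gradient sum over the certification region.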
6. Empirical Evaluation
CSGU was validated on Bitcoin-Alpha, Bitcoin-OTC, Epinions, and Slashdot signed graph datasets (3k–33k nodes), using SGCN, SNEA, SDGNN, and SiGAT backbones. Baselines included Retrain, GraphEraser, GNNDelete, GIF, and IDEA. Key findings:
- Utility: CSGU attains up to +13.9% Macro-F1 compared to best baseline.
- Privacy: Membership-inference AUC is reduced by up to 10.9% relative to the best comparator.
- Efficiency: Per-unlearning latency is sub-second to a few seconds, far outperforming retraining or GraphEraser.
Ablation studies confirmed that each stage (TIN, SIQ, DP loss, noise injection) is essential. Notably, CSGU maintains robust performance for negative-edge deletions, outperforming naïve methods. Varying deletion ratios ($0.5$–$5\%$) and privacy budgets yielded stable trade-offs.
CSGU generalizes to unsigned graphs by substituting degree centrality for sociological weights, maintaining or exceeding baseline performance in utility and privacy at similar runtime.
7. Limitations and Research Directions
CSGU relies on local strong convexity (regularized SGNNs) and the presence of sufficient triangular structure; its efficiency and guarantees degrade for extremely low-triangle graphs. The global DP budget accumulates with sequential applications, though advanced composition could mitigate this overhead. Limitations include:
- Necessity of triangular motifs for TIN expansion
- Assumed local convexity around the trained parameters
- Global DP budget growth under repeated unlearning
Proposed extensions include adaptive convexity handling (higher-order influence to enable non-convex regions), support for dynamic and streaming graphs, modeling with longer balanced motifs, and integration with federated SGNN unlearning for distributed privacy protection.
CSGU inaugurates certified unlearning for the sociologically rich domain of signed graphs, synthesizing balance/status theory with formal privacy to provide both practical and theoretical assurances (Zhao et al., 18 Nov 2025).