Certified Unlearning in Decentralized Federated Learning
- The paper introduces a certified unlearning framework that leverages Newton-style corrective updates to achieve (ε,δ)-indistinguishability between models with and without deleted client data.
- It employs second-order approximations and Fisher information to compute corrections efficiently without full retraining, reducing computational overhead.
- Privacy guarantees are enforced using calibrated Gaussian noise and network-wide propagation, achieving near-retraining accuracy while being ≈97% faster.
A certified unlearning framework for decentralized federated learning (DFL) formally guarantees that, after a client’s data or updates are deleted per a “right to be forgotten” request, the resulting model is provably (ε,δ)-indistinguishable from retraining the DFL system from scratch without the deleted data. Such certification must address the propagation of client influence through networked, peer-to-peer training—a scenario fundamentally more challenging than centralized or server-coordinated FL due to the fully decentralized communication topology and the mixing of information across clients.
1. DFL System Model and Influence Propagation
In decentralized federated learning, clients are the nodes of an undirected communication graph $G = (V, E)$. Model parameters are stored locally at each node, and updates are exchanged only with immediate neighbors. Training proceeds via decentralized SGD (DSGD):
- Each client $i$ with local dataset $D_i$ (size $n_i$) samples a mini-batch $\xi_i^t$ and computes the stochastic gradient $g_i^t = \nabla \ell(\theta_i^t; \xi_i^t)$.
- Local models are averaged according to a symmetric, doubly stochastic mixing matrix $W$ (adapted to $G$), inducing information diffusion.
- Update: $\theta_i^{t+1} = \sum_{j \in \mathcal{N}(i) \cup \{i\}} W_{ij}\, \theta_j^t - \eta\, g_i^t$.
After $T$ rounds, the aggregation of all local models serves as the global model. Lemma 1 asserts that after sufficiently many iterations, every client's information is mixed into all others with approximately equal weight: $\big|[W^T]_{ij} - \tfrac{1}{n}\big| \le \lambda^T$ for spectral parameter $\lambda < 1$ (Wu et al., 10 Jan 2026).
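The mixing claim of Lemma 1 can be illustrated numerically. The sketch below (the ring topology and gossip weights are our assumptions, not the paper's) builds a symmetric doubly stochastic matrix $W$ for a ring of 8 clients and checks that the entries of $W^T$ approach the uniform weight $1/n$ at a rate governed by the second-largest eigenvalue:

```python
import numpy as np

# Illustrative check of Lemma 1 on a ring graph (topology and weights assumed):
# a symmetric, doubly stochastic gossip matrix W mixes every client's
# contribution toward uniform weight 1/n at a geometric rate lambda^T,
# where lambda is the second-largest eigenvalue magnitude of W.
n = 8
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5                 # self weight
    W[i, (i - 1) % n] = 0.25      # left neighbor
    W[i, (i + 1) % n] = 0.25      # right neighbor

# Spectral mixing parameter: second-largest |eigenvalue| of W.
lam = np.sort(np.abs(np.linalg.eigvalsh(W)))[-2]

T = 50
WT = np.linalg.matrix_power(W, T)             # influence weights after T rounds
deviation = np.max(np.abs(WT - 1.0 / n))      # max entrywise gap from 1/n
print(f"lambda = {lam:.4f}, max |[W^T]_ij - 1/n| = {deviation:.2e}")
```

Since $W$ is symmetric and doubly stochastic, the entrywise deviation of $W^T$ from $1/n$ is bounded by the spectral norm $\lambda^T$, matching the lemma's geometric decay.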
2. Formal Definition of Certified Unlearning in DFL
Certified unlearning in DFL requires that, after removing a subset $D_f \subseteq D_i$ from client $i$, the output model is statistically indistinguishable from the model retrained from scratch on $D \setminus D_f$. The certification is based on $(\varepsilon, \delta)$-indistinguishability: for every measurable set of outcomes $\mathcal{S}$,
$$\Pr\big[\mathcal{U}(\mathcal{A}(D), D_f, \Sigma) \in \mathcal{S}\big] \;\le\; e^{\varepsilon}\, \Pr\big[\mathcal{A}(D \setminus D_f) \in \mathcal{S}\big] + \delta,$$
together with the symmetric inequality, where $\mathcal{A}$ is the DFL training operator, $\mathcal{U}$ the unlearning operator, and $\Sigma$ denotes relevant auxiliary state. This criterion directly generalizes the standard definitions used in (centralized) certified machine unlearning (Wu et al., 10 Jan 2026).
3. Newton-Style Corrective Updates and Fisher Approximation
The core unlearning mechanism computes a certified correction using a second-order (Newton-style) influence function that locally inverts the effect of the deleted samples:
- The exact retrained optimum $\theta^-$ solves $\nabla L'(\theta^-) = 0$, where $L'(\theta) = \frac{1}{n-m} \sum_{z \in D \setminus D_f} \ell(\theta; z)$ is the post-deletion empirical risk and $m = |D_f|$.
- To avoid full retraining, a Taylor expansion approximates $\theta^-$ around the trained model $\theta^*$:
$$\theta^u = \theta^* - H^{-1}\, \nabla L'(\theta^*),$$
with Hessian estimate $H = \nabla^2 L'(\theta^*)$.
- For scalability, $H$ is approximated by the empirical Fisher information:
$$F = \frac{1}{n-m} \sum_{z \in D \setminus D_f} \nabla \ell(\theta^*; z)\, \nabla \ell(\theta^*; z)^\top,$$
which matches the Hessian at the empirical minimizer for log-likelihood losses and, in its diagonal form, reduces storage from $O(d^2)$ to $O(d)$.
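A minimal sketch of this correction, assuming L2-regularized logistic regression as the local model (the problem setup, sizes, and variable names below are ours, not the paper's): we fit $\theta^*$ on the full data, form the empirical Fisher on the retained data, take one Newton-style step, and compare against an explicitly retrained model.

```python
import numpy as np

# Hypothetical sketch of the Newton-style unlearning correction with an
# empirical Fisher surrogate for the Hessian, on L2-regularized logistic
# regression. All names and constants here are illustrative assumptions.
rng = np.random.default_rng(0)
n, d, reg = 200, 5, 0.1
X = rng.normal(size=(n, d))
y = (X @ rng.normal(size=d) + 0.3 * rng.normal(size=n) > 0).astype(float)

def grad(theta, X, y):
    """Gradient of the regularized empirical risk."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    return X.T @ (p - y) / len(y) + reg * theta

def fit(X, y, steps=5000, lr=0.5):
    """Plain gradient descent to (near-)optimality."""
    theta = np.zeros(d)
    for _ in range(steps):
        theta -= lr * grad(theta, X, y)
    return theta

theta_star = fit(X, y)                 # model trained on the full data

m = 10                                 # delete the last m samples
Xr, yr = X[:-m], y[:-m]                # retained data D \ D_f

# Empirical Fisher on retained data: average of per-sample gradient outer
# products, plus the regularizer's exact curvature.
p = 1.0 / (1.0 + np.exp(-Xr @ theta_star))
G = Xr * (p - yr)[:, None]             # rows are per-sample gradients
F = G.T @ G / len(yr) + reg * np.eye(d)

# One Newton-style step toward the post-deletion optimum:
# theta_u = theta* - F^{-1} grad L'(theta*)
theta_u = theta_star - np.linalg.solve(F, grad(theta_star, Xr, yr))

theta_retrain = fit(Xr, yr)            # ground truth: retrain from scratch
print("correction error   :", np.linalg.norm(theta_u - theta_retrain))
print("no-correction error:", np.linalg.norm(theta_star - theta_retrain))
```

Under these assumptions the corrected model lands substantially closer to the retrained optimum than the uncorrected one, which is the quantity the certification then controls after noise is added.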
4. Privacy Guarantees via Gaussian Mechanism and Network Noise Propagation
To provide a formal $(\varepsilon, \delta)$-certificate, the correction is perturbed using calibrated Gaussian noise (mirroring the approach of differential privacy for adjacent datasets):
- The $\ell_2$-sensitivity $\Delta_2$ of the correction is bounded uniformly across all clients.
- Each correction is independently perturbed: $\tilde{\Delta}_i = \Delta_i + \xi_i$ with $\xi_i \sim \mathcal{N}(0, \sigma^2 I_d)$ and $\sigma = \frac{\Delta_2 \sqrt{2 \ln(1.25/\delta)}}{\varepsilon}$.
- The noisy correction is broadcast across the network and each client updates $\theta_i \leftarrow \theta_i + \tilde{\Delta}$.
- One optional post-unlearning round of DSGD on retained data can be performed; by the DP post-processing theorem, the guarantee remains valid.
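The noise-calibration step above can be sketched directly from the classical Gaussian-mechanism scale $\sigma = \Delta_2 \sqrt{2 \ln(1.25/\delta)}/\varepsilon$ (valid for $\varepsilon < 1$); the sensitivity bound and example values below are our assumptions:

```python
import numpy as np

# Classical Gaussian-mechanism calibration applied to a correction vector.
# delta2 (the l2-sensitivity bound) and the example numbers are assumptions.
def calibrate_sigma(delta2, eps, delta):
    """Noise scale achieving (eps, delta)-indistinguishability, eps < 1."""
    return delta2 * np.sqrt(2.0 * np.log(1.25 / delta)) / eps

def perturb(correction, delta2, eps, delta, rng):
    """Release the correction under the Gaussian mechanism."""
    sigma = calibrate_sigma(delta2, eps, delta)
    return correction + rng.normal(scale=sigma, size=correction.shape)

rng = np.random.default_rng(1)
corr = np.array([0.02, -0.01, 0.05])          # a hypothetical correction
noisy = perturb(corr, delta2=0.1, eps=0.5, delta=1e-5, rng=rng)
sigma = calibrate_sigma(0.1, 0.5, 1e-5)
print(f"sigma = {sigma:.4f}")
```

Because downstream steps (broadcast, the optional DSGD round) only post-process the noisy release, the $(\varepsilon, \delta)$ guarantee carries through unchanged.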
By Lemmas 2–3 and Theorem 1, this procedure yields a certified unlearning guarantee; i.e., the output distribution of the unlearning operation is close (in the $(\varepsilon, \delta)$-DP sense) to that of retraining after data deletion (Wu et al., 10 Jan 2026).
5. Algorithmic Workflow and Complexity
A typical certified unlearning episode in DFL comprises:
- The client requesting deletion computes the correction using the local Hessian/Fisher (on existing retained data).
- The client adds Gaussian noise, sends the correction to neighbors.
- The correction is disseminated via network flooding/gossip, ensuring each client receives the correction once.
- All clients apply the update and, if desired, perform a single fine-tuning round.
- No communication with the deleted client is required after the initial request.
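The dissemination step can be sketched as a first-receipt flood over the peer graph (the graph, message format, and function names are our assumptions): each client forwards the correction to its neighbors only the first time it receives it, so every node applies it exactly once.

```python
from collections import deque

# Illustrative flooding of the noisy correction over a peer-to-peer graph.
# Each client forwards on first receipt, so every node gets it exactly once
# and the requester needs no further interaction after the initial send.
def flood(neighbors, source):
    """BFS flood: returns {node: round at which it first receives the message}."""
    received = {source: 0}
    frontier = deque([source])
    while frontier:
        u = frontier.popleft()
        for v in neighbors[u]:
            if v not in received:              # forward only on first receipt
                received[v] = received[u] + 1
                frontier.append(v)
    return received

# Ring of 6 clients; client 0 requests deletion and floods its correction.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
arrival = flood(ring, source=0)
print(arrival)
print("rounds to full coverage:", max(arrival.values()))
```

The number of rounds to full coverage equals the source's eccentricity (at most the graph diameter), which is why a single network-wide broadcast suffices regardless of how training rounds scale.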
The storage overhead is $O(d^2)$ per deletion for a full Hessian, or $O(d)$ using diagonal Fisher approximations. Communication cost is a single network-wide broadcast of the correction vector. Thus, compared to retraining (which requires $O(T)$ communication rounds), the certified unlearning protocol completes in $O(1)$ communication rounds plus local computation (Wu et al., 10 Jan 2026).
6. Theoretical Utility and Privacy Bounds
The certified unlearning protocol delivers explicit utility and privacy bounds:
- For every client $i$, the post-unlearning model remains within a bounded distance of the exact retrained optimum, with the bound controlled by the noise scale $\sigma$ and the network mixing error.
- The Newton-based surrogate incurs an approximation error that is second order (quadratic) in the fraction of deleted data.
- The overall generalization bound for the global minimizer combines this approximation error, the injected Gaussian noise, and the decentralized consensus error.
All bounds scale favorably when the deleted-data fraction is small and the network mixing is rapid.
7. Empirical Validation and Network Scalability
The certified DFL unlearning framework is empirically validated on image (CIFAR-10/ResNet-18) and tabular (MNIST/logistic regression) benchmarks with both ring and Erdős–Rényi topologies and varying degrees of non-IIDness:
- Post-unlearning accuracy is within a small margin of retraining for all deletion modalities (sample-, class-, and client-wise).
- Membership inference attack precision/recall drops to random guessing (50%), consistent with full deletion.
- Unlearning is ≈97% faster than retraining, with a single correction round replacing hundreds of retraining rounds in naive PDUDT approaches.
The protocol is robust to network structure, demonstrates scalable and efficient removal, and achieves formal $(\varepsilon, \delta)$-unlearning certification (Wu et al., 10 Jan 2026).
The certified unlearning framework for DFL rigorously integrates influence quantification, second-order correction, scalable Hessian statistics, and formal privacy analysis adapted to decentralized architectures. It achieves provable guarantees under minimal network assumptions and demonstrates practical, efficient, and robust performance, confirming its suitability for RTBF compliance in peer-to-peer federated learning.