
Centralized Unlearning Methods

Updated 30 January 2026
  • Centralized unlearning covers methods that remove specific data points from ML models, ensuring the output is statistically equivalent to a model retrained on the remaining data.
  • It employs strategies such as full retraining, SISA partitioning, and coded and ticketed approaches to erase data influence efficiently and exactly, with strong theoretical guarantees.
  • The approach balances computational cost, memory usage, and utility preservation while meeting privacy requirements such as the right to be forgotten.

Centralized unlearning refers to the class of methods and frameworks designed to remove the influence of specific data points or subsets from machine learning models trained in a centrally coordinated setting. The objective is to render the model statistically indistinguishable from one trained on the retained data only, aligning with privacy requirements such as the right to be forgotten. In contrast to distributed/federated unlearning, all operations and guarantees are coordinated by a central algorithm or server, enabling strong theoretical guarantees and practical performance trade-offs relevant to large-scale machine learning deployments.

1. Formal Frameworks and Guarantees

In the centralized paradigm, let $D \subseteq \mathcal{Z}^n$ denote the full training set, $D_e \subseteq D$ the erased subset, and $D_r = D \setminus D_e$ the retained data. For learning algorithm $\mathcal{A} : \mathcal{Z}^* \to \mathcal{H}$ and hypothesis (model parameter) space $\mathcal{H}$, a centralized unlearning process $\mathcal{U}(D, D_e, M)$ (where $M = \mathcal{A}(D)$) must output $M_{\mathrm{unlearn}}$ such that:

$$M_{\mathrm{unlearn}} \approx \mathcal{A}(D_r)$$

Exact Unlearning

Exact unlearning requires statistical indistinguishability:

$$\Pr[\mathcal{U}(D, D_e, \mathcal{A}(D)) \in S] = \Pr[\mathcal{A}(D_r) \in S]$$

for every measurable $S \subseteq \mathcal{H}$. This reflects perfect removal, as if the model had been retrained from scratch on $D_r$ (Wang et al., 2024).

Approximate Unlearning

Approximate ($\varepsilon$-)unlearning relaxes this to an $\varepsilon$-indistinguishability guarantee, typically formulated as:

$$e^{-\varepsilon} \leq \frac{\Pr[\mathcal{U}(D, \{z\}, \mathcal{A}(D)) \in S]}{\Pr[\mathcal{A}(D \setminus \{z\}) \in S]} \leq e^{\varepsilon}$$

which can be extended to $(\varepsilon, \delta)$ guarantees in analogy to differential privacy (Wang et al., 2024).
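As a toy illustration (entirely illustrative, not drawn from the cited papers), the $\varepsilon$-indistinguishability ratio can be estimated empirically by sampling both distributions and comparing histogram masses. Here the "learner" is a noisy mean and the unlearner simply recomputes it on $D_r$, so the estimated $\varepsilon$ should be near zero; a biased unlearner would push it up.

```python
import numpy as np

rng = np.random.default_rng(0)
D = np.array([1.0, 2.0, 3.0, 4.0])
z = D[-1]
D_r = D[:-1]          # retained data after erasing z
scale = 0.5           # Laplace noise scale of the toy randomized learner

def A(data, n_samples=200_000):
    # Toy randomized learner: A(data) = mean(data) + Laplace noise.
    return data.mean() + rng.laplace(0.0, scale, size=n_samples)

unlearned = A(D_r)    # U(D, {z}, A(D)): here, simply retrain on D_r
retrained = A(D_r)    # reference A(D \ {z})

# Estimate max |log probability ratio| over histogram bins (events S).
bins = np.linspace(0.0, 4.0, 21)
p, _ = np.histogram(unlearned, bins=bins, density=True)
q, _ = np.histogram(retrained, bins=bins, density=True)
mask = (p > 0) & (q > 0)
eps_hat = float(np.max(np.abs(np.log(p[mask] / q[mask]))))
print(f"empirical epsilon estimate: {eps_hat:.3f}")
```

Since both samples come from the same distribution, `eps_hat` reflects only sampling noise; the same estimator applied to an approximate unlearner would reveal its residual bias.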

2. Algorithmic Taxonomy

Exact Methods

  • Full Retraining: The baseline, running $\mathcal{A}$ on $D_r$; cost is proportional to the original dataset size (Li et al., 2024).
  • Data Partitioning / SISA (Sharded, Isolated, Sliced, Aggregated): $D$ is split into $k$ shards $D^{(i)}$, each with submodel $M^{(i)}$. Erasure of $D_e \subseteq D^{(j)}$ triggers retraining only of the affected submodel, greatly reducing cost (Wang et al., 2024, Li et al., 2024).
  • Mergeable Encodings / Ticketed Unlearning: Compress $D$ into small mergeable encodings (central state) and issue per-sample tickets, enabling exact recovery of the model on $D_r$. Efficiency depends on concept-class compressibility (Ghazi et al., 2023).
  • Coded Unlearning: Use linear encodings to mix shards before training weak learners. On data deletion, update only the affected coded shards and retrain the corresponding learners, achieving perfect unlearning at minimal retraining cost (Aldaghri et al., 2020).

Approximate Methods

  • Certified Data Removal (Newton Step): Remove influence using a one-step Newton update, adding calibrated noise for $\varepsilon$-certified removal (Wang et al., 2024, Li et al., 2024).
  • Influence Function / Fisher Information Based: Apply $H^{-1}\nabla L(w; D_e)$ or $F^{-1}\nabla L(w; D_e)$ corrections, with optional noise for privacy guarantees (Li et al., 2024).
  • Gradient Tracking (DeltaGrad): Maintain and adjust cached gradients along the training trajectory for efficient updates (Wang et al., 2024).
  • Bayesian Posterior Unlearning: Update posteriors to match $p(\theta \mid D_r)$ via reverse-KL variational optimization (Wang et al., 2024).

3. Representative Algorithms

Split-Shard (SISA) Unlearning

  1. Partition the dataset: $D \to \{D^{(1)}, \ldots, D^{(k)}\}$.
  2. Train submodels independently; aggregate predictions.
  3. On unlearning, retrain only the affected submodel and update the aggregate (Wang et al., 2024, Li et al., 2024).
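The three steps above can be sketched in a few lines. This is a minimal illustration with assumed components (least-squares submodels, mean aggregation), not the original SISA implementation; the point is that `unlearn` refits exactly one shard.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 120, 5, 4
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

shards = np.array_split(np.arange(n), k)       # step 1: partition into k shards

def fit(idx):
    # Toy submodel: least-squares weights on one shard.
    return np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]

submodels = [fit(idx) for idx in shards]       # step 2: train independently

def predict(x):
    # Aggregate submodel predictions by averaging.
    return float(np.mean([w @ x for w in submodels]))

def unlearn(sample_id):
    # Step 3: locate the shard holding the sample, drop it, refit that shard only.
    for j, idx in enumerate(shards):
        if sample_id in idx:
            shards[j] = idx[idx != sample_id]
            submodels[j] = fit(shards[j])
            return j

affected = unlearn(7)
print("retrained shard:", affected)
```

Only the shard containing sample 7 is retrained; the other $k-1$ submodels and the aggregation rule are untouched, which is the source of the sublinear unlearning cost.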

Certified Newton Step (Guo et al.)

For regularized loss $L(\theta; D)$: $\theta^- = \theta^* + H^{-1} \Delta + \text{noise}(\varepsilon)$, where $H$ is the Hessian of the loss on $D_r$ at $\theta^*$ and $\Delta$ is the gradient of the erased samples' loss on $D_e$ at $\theta^*$ (Wang et al., 2024). For quadratic losses this Newton step recovers the retrained optimum exactly; the calibrated noise supplies the $\varepsilon$-certificate for general convex losses.
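A minimal sketch for ridge regression (assumed setup, not Guo et al.'s exact implementation), where the quadratic loss makes the one-step Newton removal exact; sign conventions for $\Delta$ vary across write-ups, and here $\Delta$ is the gradient of the erased samples' loss, entering with a plus. The calibrated noise needed for an $\varepsilon$-certificate is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, lam = 50, 4, 1.0
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def ridge(Xm, ym):
    # Closed-form ridge solution: (X^T X + lam I)^{-1} X^T y.
    return np.linalg.solve(Xm.T @ Xm + lam * np.eye(d), Xm.T @ ym)

theta = ridge(X, y)                      # model trained on the full data D
erase = np.array([0, 1])                 # indices of D_e
keep = np.setdiff1d(np.arange(n), erase)

# Newton step: theta_minus = theta + H_r^{-1} Delta, with
# H_r = Hessian of the loss on D_r, Delta = gradient of erased-sample loss at theta.
H_r = X[keep].T @ X[keep] + lam * np.eye(d)
delta = X[erase].T @ (X[erase] @ theta - y[erase])
theta_minus = theta + np.linalg.solve(H_r, delta)
# (an eps-certified variant would add calibrated noise to theta_minus here)

theta_retrain = ridge(X[keep], y[keep])  # reference: retrain from scratch on D_r
print(np.allclose(theta_minus, theta_retrain))
```

For non-quadratic convex losses the step is only approximate, which is exactly what the added noise and the $\varepsilon$-certificate account for.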

Coded Protocol

On each unlearning request, subtract erased sample’s contributions from coded shards, retrain only involved weak learners, and average outputs (Aldaghri et al., 2020).
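A highly simplified sketch of this idea (illustrative structure and names, not the protocol of Aldaghri et al.): each coded shard stores sufficient statistics summed over the raw shards a binary code assigns to it, and deleting a sample subtracts its contribution from the affected coded shards and refits only those weak learners.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, lam = 60, 3, 1.0
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)

shard_of = np.arange(n) % 3                     # assign samples to 3 raw shards
G = np.array([[1, 0],                           # raw shard 0 -> coded shard 0
              [1, 1],                           # raw shard 1 -> coded shards 0, 1
              [0, 1]])                          # raw shard 2 -> coded shard 1

def fit(A, b):
    # Ridge weak learner from sufficient statistics (A = X^T X, b = X^T y).
    return np.linalg.solve(A + lam * np.eye(d), b)

# Build each coded shard's sufficient statistics from its assigned raw shards.
stats = [[np.zeros((d, d)), np.zeros(d)] for _ in range(2)]
for i in range(n):
    for j in range(2):
        if G[shard_of[i], j]:
            stats[j][0] += np.outer(X[i], X[i])
            stats[j][1] += X[i] * y[i]
learners = [fit(*s) for s in stats]

def unlearn(i):
    # Subtract sample i's contribution from each affected coded shard; refit those.
    affected = [j for j in range(2) if G[shard_of[i], j]]
    for j in affected:
        stats[j][0] -= np.outer(X[i], X[i])
        stats[j][1] -= X[i] * y[i]
        learners[j] = fit(*stats[j])
    return affected

affected = unlearn(0)
print("coded shards refit:", affected)
```

Sample 0 lives in raw shard 0, which the code maps only to coded shard 0, so one weak learner is refit while the other is untouched; final predictions would average the weak learners' outputs as in the text.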

Ticketed Scheme

Encode the data into a compact central state plus per-sample tickets; merging the encodings using the tickets of the surviving samples reconstructs the unlearned model, enabling polylogarithmic space (Ghazi et al., 2023).

4. Theoretical Analysis and Complexity

| Method | Computational Cost | Guarantee |
| --- | --- | --- |
| Full Retrain | $O(\lvert D \rvert)$ | Exact, statistical indistinguishability |
| SISA Partition | $O(\lvert D \rvert / k)$ | Exact on retained/affected shards |
| Certified Newton | $O(d^3)$ (Hessian inversion) | $\varepsilon$-certified removal |
| Influence/Fisher | $O(m)$ in erased-set size | Parameter-closeness bound |
| Coded Protocol | $O(\#\,\text{affected weak learners})$ | Exact on convex models |
| Ticketed Unlearn | $O(\operatorname{polylog} n)$ | Exact for mergeable classes |

Capacity under DP unlearning is $m = n / d^{1/4}$ deletions before utility degradation (Wang et al., 2024).
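Taking the stated bound at face value, a quick arithmetic check (illustrative numbers, not from the cited survey) shows how weakly the capacity depends on dimension:

```python
# Deletion-capacity bound m = n / d^{1/4}: with a million samples in
# ten thousand dimensions, d^{1/4} = 10, so roughly 100k deletions
# are supported before utility degrades.
n, d = 1_000_000, 10_000
m = n / d ** 0.25
print(f"supported deletions before utility degrades: ~{m:.0f}")
```

Because the denominator grows only as the fourth root of $d$, the capacity is dominated by the sample size $n$.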

5. Metrics and Verification

  • Anamnesis Index (AIN): Ratio of relearning time for the unlearned vs. retrained model ($\approx 1$ denotes complete unlearning) (Li et al., 2024).
  • Activation/Weight Distance: $\|\sigma(\bar{w}^T x) - \sigma(w_r^T x)\|_1$, $\|\bar{w} - w_r\|_2$.
  • Output Distribution KL/JS Divergence.
  • Test Accuracy: Measured on $D_r$, $D_e$, and held-out data.
  • Representation Diagnostics: PCA similarity, PCA shift, centered kernel alignment, and Fisher information spectrum to detect reversible vs. irreversible forgetting, particularly for large models (Xu et al., 22 May 2025).
  • Verification: Watermark-based triggers, membership inference, cryptographic proofs (Li et al., 2024).
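Two of the metrics above, weight distance and output-distribution KL divergence, can be computed directly from model parameters. A toy sketch with assumed shapes (small softmax classifiers standing in for the unlearned model $\bar{w}$ and the retrained reference $w_r$):

```python
import numpy as np

rng = np.random.default_rng(4)
d, n_classes = 8, 3
w_bar = rng.normal(size=(n_classes, d))                # "unlearned" model weights
w_r = w_bar + 0.01 * rng.normal(size=(n_classes, d))   # nearly identical reference
x = rng.normal(size=d)                                 # a single probe input

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# ||w_bar - w_r||_2: parameter-space distance between the two models.
weight_dist = float(np.linalg.norm(w_bar - w_r))

# KL(p || q) between the two models' output distributions on the probe input.
p, q = softmax(w_bar @ x), softmax(w_r @ x)
kl = float(np.sum(p * np.log(p / q)))

print(f"weight distance: {weight_dist:.4f}, KL: {kl:.6f}")
```

Both values should be near zero for a faithful unlearner; in practice the KL term is averaged over a probe set rather than a single input.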

6. Trade-Offs, Case Studies, and Open Questions

Practical Trade-offs

| Aspect | Exact (SISA/Coded) | Approximate (Newton, Influence, Bayesian) |
| --- | --- | --- |
| Retrain Cost | Sublinear in $\lvert D \rvert$ (per-shard/coded) | Linear in erased set or matrix computation |
| Storage | Shard/encoding memory required | Gradient/Hessian/VI state, typically smaller |
| Utility Loss | None (sharded/coded), minimal (encoded) | $\varepsilon$-residual error possible |
| Complexity | Replay/merge, modular | Hessian/influence/VI machinery |
| Verification | Direct discriminative comparison | Requires DP or statistical analysis |

Empirical Benchmarks

  • SISA yields retraining speedups up to $10\times$ with $2\times$ storage overhead; test-accuracy drop $< 0.5\%$ (Wang et al., 2024, Li et al., 2024).
  • Amnesiac training on MNIST erases $1\%$ of the data in 2 seconds; a full retrain takes $\sim 10$ minutes (Li et al., 2024).
  • Coded protocols on regression problems achieve $> 20\%$ lower MSE at fixed cost compared to uncoded ensembles (Aldaghri et al., 2020).

Reversibility in LLM Unlearning

Shallow perturbations (e.g., to output layer weights) result in "reversible catastrophic forgetting," where original behavior is rapidly restorable, suggesting token-level metrics can misdiagnose true erasure. Only deep, multi-layer perturbations induce "irreversible catastrophic forgetting," detectable via representation-level metrics (PCA, CKA, Fisher spectra) (Xu et al., 22 May 2025). This distinction is central to trustworthy unlearning in large models.

Open Problems

7. Architecture and Implementation Considerations

  • Centralization Requirements: Data partitioning, ticket issuance, code generation, and state management are orchestrated on the server; all computational branches (shard retrain, encoding merge, gradient update) are centrally supervised (Wang et al., 2024, Aldaghri et al., 2020, Ghazi et al., 2023).
  • Communication Costs: Partitioned/coded and ticketed approaches minimize retraining and communication overhead by only updating affected segments of the model/state, not global model parameters.
  • Warm-Start Optimization: Valid for convex learners (e.g., ridge regression), but a full retrain is needed for deep networks to guarantee perfect unlearning (Aldaghri et al., 2020, Wang et al., 2024).
  • Deployment: Exact methods preferred for high-stakes privacy contexts; approximate methods beneficial for routine, low-latency, or high-frequency unlearning requests.

Centralized unlearning remains a rapidly evolving area with ongoing improvements in algorithms, complexity bounds, verification techniques, and large-model deployment strategies. It is instrumental for compliance, privacy, and control in contemporary machine learning infrastructures.
