DUET: Distilled Unlearning from an Efficient Teacher
- The paper introduces DUET, which achieves selective and persistent forgetting by distilling teacher guidance into a student model without retraining from scratch.
- It combines competent teacher predictions on retained data with randomized outputs on forget data via an in-context and adapter-driven framework.
- DUET is validated across deep nets and LLMs, offering scalable, data-efficient, and robust unlearning with improved metrics over conventional retraining.
Distilled Unlearning from an Efficient Teacher (DUET) is a family of algorithms for machine unlearning that achieve selective, persistent forgetting by leveraging a distillation-based student-teacher framework. DUET combines the efficiency of in-context steering with the robustness and durability of parameter updates—enabling the removal of targeted knowledge (such as specific user data, private facts, or hazardous content) from large neural networks, particularly LLMs, while preserving the utility of retained knowledge. DUET formulations apply across deep neural architectures and unify prior approaches spanning explicit retraining, AMNESIAC learning, and prompt steering, offering scalable, data-efficient, and attack-resilient unlearning protocols (Chundawat et al., 2022, Chen et al., 2023, Zhong et al., 29 Jan 2026).
1. Formal Problem Statement and Motivation
The machine unlearning objective is: given a trained model or LLM with parameters $\theta$, a full training set $D$, and a designated forget set $D_f \subset D$ (e.g., private or undesirable data), efficiently update the model to obtain parameters $\theta_u$ such that: (i) the influence of $D_f$ is eradicated ("forgets"), and (ii) predictive quality on the retain set $D_r = D \setminus D_f$ is preserved, without retraining from scratch.
Prior tuning-based unlearning (e.g., parameter optimization using negative loss over ) is computationally intensive and susceptible to catastrophic forgetting if the balancing term is naively tuned. In-context unlearning via prompt steering is lightweight, but the knowledge removal is superficial and can be bypassed. DUET bridges these extremes by distilling the refusal or randomization behavior of an efficiently contextualized teacher (teacher model or prompt-steered LLM) into a persistent student model, achieving robust, parameter-centric forgetting (Chundawat et al., 2022, Chen et al., 2023, Zhong et al., 29 Jan 2026).
2. DUET Methodological Frameworks
2.1 Classical Deep Nets: Competent-Incompetent Teacher Distillation (Chundawat et al., 2022)
In deep nets, DUET initializes a student $S$ from the original model (the "competent teacher" $T_s$) and trains it on a union of forget ($D_f$) and retain ($D_r$) batches. Each sample $x$ is labeled $l_x = 1$ if $x \in D_f$ (forget) or $l_x = 0$ otherwise (retain). The student imitates $T_s$ on $D_r$ using KL divergence and an "incompetent" (random or weak) teacher $T_d$ on $D_f$, injecting targeted randomness as follows:

$$\mathcal{L}(x) = l_x \,\mathrm{KL}\big(T_d(x)\,\|\,S(x)\big) + (1 - l_x)\,\mathrm{KL}\big(T_s(x)\,\|\,S(x)\big)$$

By alternately minimizing this loss over $D_f$ and $D_r$, DUET erases knowledge on $D_f$ by randomizing predictions (via $T_d$), while retaining fidelity on $D_r$ by matching $T_s$.
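The per-sample loss described above can be sketched in a few lines of numpy; this is a minimal illustration of the labeling scheme (teacher distributions and function names are illustrative, not the paper's code):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-12):
    # KL(p || q) per sample, summed over classes.
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def duet_loss(student_logits, competent_logits, incompetent_logits, is_forget):
    """Per-sample distillation loss: match the incompetent teacher on
    forget samples (label 1), the competent teacher on retain samples
    (label 0)."""
    s = softmax(student_logits)
    t_good = softmax(competent_logits)
    t_bad = softmax(incompetent_logits)
    l = is_forget.astype(float)
    return l * kl(t_bad, s) + (1.0 - l) * kl(t_good, s)
```

On a retain sample, a student that exactly matches the competent teacher incurs zero loss; on a forget sample, the loss pulls the student toward the incompetent teacher's (near-uniform) distribution.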
2.2 LLMs with Lightweight Adapters and Selective Distillation (Chen et al., 2023)
For LLMs, DUET (Chen et al., 2023) ("Efficient Unlearning"/EUL) inserts tiny, trainable "unlearning adapters" into each Transformer block of a frozen teacher $T$, forming the student $S$. The optimization objective is multi-term, balancing a selective KL-divergence term on the forget split, a retention (task) loss on the retain split, and an anti-memorization term (a negated masked-LM loss) on $D_f$. Alternating optimization steps on $D_f$ and $D_r$ ensure both effective forgetting and knowledge retention.
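The balance among the three terms can be sketched as a single scalar objective; the weights `alpha`, `beta` and the function name here are illustrative assumptions, not the paper's notation:

```python
def eul_objective(kl_forget, task_loss_retain, mlm_loss_forget,
                  alpha=1.0, beta=1.0):
    """Hypothetical combination of EUL's three terms: a selective KL
    term on the forget split, a retention (task) loss on the retain
    split, and a *negated* masked-LM loss on the forget split that
    penalizes memorization of D_f. alpha and beta are illustrative
    trade-off weights."""
    return kl_forget + alpha * task_loss_retain - beta * mlm_loss_forget
```

The sign flip on the masked-LM term is the key design choice: lowering the objective requires *raising* the language-modeling loss on forget data, actively degrading memorized content.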
2.3 Logit-Level Distillation from Efficiently Contextualized Teachers (Zhong et al., 29 Jan 2026)
In the latest DUET formulation (Zhong et al., 29 Jan 2026), a prompt-steered teacher LLM ("in-context refusal") guides the forgetting process. For each query $q$, the teacher's first-token logits $z^T(p \oplus q)$ (computed with refusal prefix $p$) are selectively distilled into a fully parameterized student $z^S(q)$ (no prefix) using a Huber (smoothed-L1) regression on the Top-$K$ candidate logits:

$$\mathcal{L}_{\text{distill}} = \frac{1}{K} \sum_{i \in \text{Top-}K(z^T)} \mathrm{Huber}\big(z_i^S(q) - z_i^T(p \oplus q)\big)$$

This logit-centric loss embeds in-context refusal behavior into the model parameters, producing persistent forgetting that is robust to prompt removal or reset.
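A minimal numpy sketch of Top-$K$ Huber logit distillation (the Top-$K$ indices are chosen from the teacher's logits; names and the default $K$ are illustrative):

```python
import numpy as np

def huber(x, delta=1.0):
    # Smoothed-L1 penalty: quadratic near zero, linear in the tails.
    a = np.abs(x)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))

def topk_logit_distill_loss(student_logits, teacher_logits, k=5):
    """Huber regression of the student's logits onto the teacher's
    Top-K first-token logits, with the K indices selected by the
    teacher's ranking."""
    idx = np.argsort(teacher_logits)[::-1][:k]
    return huber(student_logits[idx] - teacher_logits[idx]).mean()
```

The loss vanishes when the student already matches the teacher on those $K$ positions, and ignores the (typically noisy) tail of the vocabulary.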
3. Sequential Unlearning and Fusion
DUET supports accumulation of multiple, non-overlapping unlearning operations without destructive interference. In (Chen et al., 2023), a separate adapter set with weights $W_i$ is trained for each forget request $i = 1, \dots, m$, and the adapters are then linearly fused into a single $W^{\ast}$ by solving:

$$\min_{W} \sum_{i=1}^{m} \big\| W X_i - W_i X_i \big\|_F^2$$

with the closed-form solution

$$W^{\ast} = \Big(\sum_{i=1}^{m} W_i X_i X_i^{\top}\Big)\Big(\sum_{i=1}^{m} X_i X_i^{\top}\Big)^{-1}$$

where $X_i$ are the hidden representations for request $i$ just before the adapter. This mechanism supports efficient, "post hoc" fusion without additional backpropagation, enabling responsive deployment in settings with streaming deletion requests.
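The closed-form fusion is a plain least-squares problem and can be sketched with numpy; the small ridge term is an assumption added here for numerical stability, not part of the paper's formulation:

```python
import numpy as np

def fuse_adapters(adapter_weights, activations, ridge=1e-6):
    """Fuse per-request adapter weight matrices W_i (out_dim x d) into a
    single W by least squares over the pre-adapter hidden states X_i
    (d x n_i each): min_W sum_i ||W X_i - W_i X_i||_F^2."""
    d = adapter_weights[0].shape[1]
    lhs = np.zeros((adapter_weights[0].shape[0], d))  # sum_i W_i X_i X_i^T
    rhs = np.zeros((d, d))                            # sum_i X_i X_i^T
    for W, X in zip(adapter_weights, activations):
        gram = X @ X.T
        lhs += W @ gram
        rhs += gram
    # Closed form; ridge guards against a singular Gram sum.
    return lhs @ np.linalg.inv(rhs + ridge * np.eye(d))
```

With a single request the fused matrix recovers that request's adapter exactly (up to the ridge perturbation), which is a useful sanity check.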
4. Quantitative Evaluation Protocols
Evaluation uses a variety of metrics to ensure that forgetting and retention are jointly quantified.
- Zero Retrain Forgetting (ZRF) Metric (Chundawat et al., 2022): uses the Jensen–Shannon divergence between the unlearned model $M$ and the incompetent teacher $T_d$ on $D_f$:

$$\mathrm{ZRF} = 1 - \frac{1}{n_f} \sum_{i=1}^{n_f} \mathrm{JS}\big(M(x_i),\, T_d(x_i)\big)$$

Values near $1$ indicate the model mimics random guessing on forgotten samples.
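A minimal numpy sketch of the ZRF computation (base-2 logs keep the JS divergence, and hence ZRF, in $[0, 1]$; function names are illustrative):

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between probability rows p and q,
    in bits (base-2 logs), so the result lies in [0, 1]."""
    m = 0.5 * (p + q)
    def kl(a, b):
        return np.sum(a * (np.log2(a + eps) - np.log2(b + eps)), axis=-1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def zrf(unlearned_probs, incompetent_probs):
    """ZRF = 1 - mean JS over the forget set; near 1 means the
    unlearned model matches the random teacher on D_f."""
    return 1.0 - js_divergence(unlearned_probs, incompetent_probs).mean()
```

Identical output distributions give ZRF = 1 (perfect mimicry of the random teacher); fully disjoint distributions give ZRF = 0.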
- Task-Specific Metrics (Chen et al., 2023, Zhong et al., 29 Jan 2026):
- Classification accuracy on test, retain, and forget splits.
- ROUGE-L F1 on QA sets for forgetting (R-Forget, lower is better) and utility retention (R-Retain, higher is better).
- MMLU multi-choice accuracy for general capabilities.
- Masked-LM loss on forget data, membership inference, and attack resilience.
- Efficiency: Adapter-based and logit-distilled DUET variants are orders of magnitude faster and dramatically more data- and compute-efficient than tuning-based or retrain-from-scratch baselines.
5. Empirical Findings
Extensive experimental studies spanning image classification (Chundawat et al., 2022), summarization and sentiment analysis (Chen et al., 2023), and knowledge-based QA (Zhong et al., 29 Jan 2026) establish that DUET achieves state-of-the-art trade-offs between forgetting (reduction of knowledge on ) and retention (maintenance of utility):
| Model / Method | Forget Accuracy (↓) | Retain Accuracy (↑) | Utility (MMLU/ROUGE, ↑) | Training Time (s) |
|---|---|---|---|---|
| Retrain (gold) | Near-chance | ≈100% | Highest | High |
| DUET (adapters) | 4–57% (domain/task) | 71–99% | Matches gold | 2–20× faster than retrain |
| Incompetent teacher | ~random | ≈ baseline | Maintains generalization | n/a |
On MUSE-Books with Llama-3.2B, DUET achieves R-Forget = 4.27 (vs. 32.13 for the base model), R-Retain = 78.33 (84.29 base), and MMLU = 61.45 (61.46 base), with a joint-score improvement over tuning-based and flat baselines. On WMDP-Bio/Cyber, DUET achieves the lowest Acc-Forget and the highest MMLU relative to established methods (Zhong et al., 29 Jan 2026).
Sequential fusion ("DUET-fuse") consistently reduces forget-set accuracy while preserving test accuracy, outperforming sequential fine-tuning (Chen et al., 2023).
6. Limitations, Robustness, and Open Questions
DUET does not provide formal PAC-style or differential privacy guarantees; empirical effectiveness is certified via surface metrics and attack simulations. Sophisticated jailbreak or reverse-engineering attacks maintain non-trivial success rates (ASR ≈ 35%; Zhong et al., 29 Jan 2026), indicating residual extractable knowledge. Precise boundaries between safe and forbidden knowledge remain under-defined and rely on the quality and specificity of refusal prompts (Zhong et al., 29 Jan 2026). Current evaluation regimes depend on output-level verification; deeper latent-space auditing and membership inference probing are proposed directions. Compute scalability for continuous/delete-all streaming and federated settings is unresolved.
7. Significance and Comparative Analysis
DUET is the first unlearning strategy to unify (a) the efficiency and semantic specificity of prompt/in-context teacher construction and (b) the parameter persistence of fine-tuning or adapter-style descent. It achieves state-of-the-art forgetting–retention trade-offs, with high data efficiency (~2k tokens per request), robustness to reverse prompt attacks, and infrastructure for post hoc sequential fusion (Chundawat et al., 2022, Chen et al., 2023, Zhong et al., 29 Jan 2026). A notable implication is that logit-level distillation from efficiently contextualized teachers enables embedding of targeted refusal behavior directly into model parameters, bridging the gap between ephemeral steering and heavyweight retraining.
Further research may explore extension to regression, structured prediction, federated removal, robust adversarial unlearning, and principled certification of unlearning beyond empirical benchmarks.