Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
112 tokens/sec
GPT-4o
10 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
3 tokens/sec
DeepSeek R1 via Azure Pro
55 tokens/sec
2000 character limit reached

A mean teacher algorithm for unlearning of language models (2504.13388v1)

Published 18 Apr 2025 in cs.LG and cs.CL

Abstract: One of the goals of LLM unlearning is to reduce memorization of selected text instances while retaining the model's general abilities. Despite various proposed methods, reducing memorization of large datasets without noticeable degradation in model utility remains challenging. In this paper, we investigate the mean teacher algorithm (Tarvainen & Valpola, 2017), a simple proximal optimization method from continual learning literature that gradually modifies the teacher model. We show that the mean teacher can approximate a trajectory of a slow natural gradient descent (NGD), which inherently seeks low-curvature updates that are less likely to degrade the model utility. While slow NGD can suffer from vanishing gradients, we introduce a new unlearning loss called "negative log-unlikelihood" (NLUL) that avoids this problem. We show that the combination of mean teacher and NLUL improves some metrics on the MUSE benchmarks (Shi et al., 2024).

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Authors (1)