
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning (2410.07163v2)

Published 9 Oct 2024 in cs.CL, cs.AI, and cs.LG

Abstract: In this work, we address the problem of LLM unlearning, aiming to remove unwanted data influences and associated model capabilities (e.g., copyrighted data or harmful content generation) while preserving essential model utilities, without the need for retraining from scratch. Despite the growing need for LLM unlearning, a principled optimization framework remains lacking. To this end, we revisit the state-of-the-art approach, negative preference optimization (NPO), and identify the issue of reference model bias, which could undermine NPO's effectiveness, particularly when unlearning forget data of varying difficulty. Given that, we propose a simple yet effective unlearning optimization framework, called SimNPO, showing that 'simplicity' in removing the reliance on a reference model (through the lens of simple preference optimization) benefits unlearning. We also provide deeper insights into SimNPO's advantages, supported by analysis using mixtures of Markov chains. Furthermore, we present extensive experiments validating SimNPO's superiority over existing unlearning baselines in benchmarks like TOFU and MUSE, and its robustness against relearning attacks. Code is available at https://github.com/OPTML-Group/Unlearn-Simple.
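
To make the "removing the reliance on a reference model" idea concrete, below is a minimal sketch contrasting an NPO-style forget loss, which penalizes the policy's log-probability ratio against a frozen reference model, with a SimNPO-style reference-free, length-normalized variant in the spirit of simple preference optimization. This is an illustrative approximation, not the authors' released implementation; the function names, tensor layouts, and default values of `beta` and `gamma` are assumptions.

```python
import torch
import torch.nn.functional as F

def npo_loss(policy_logps, ref_logps, beta=0.1):
    # policy_logps / ref_logps: summed log-probabilities of each forget-set
    # response under the current policy and a frozen reference model.
    log_ratio = policy_logps - ref_logps
    return -(2.0 / beta) * F.logsigmoid(-beta * log_ratio).mean()

def simnpo_style_loss(policy_logps, response_lengths, beta=2.5, gamma=0.0):
    # Reference-free variant: only the policy's length-normalized
    # log-probability of the forget-set response is penalized,
    # offset by a reward margin gamma.
    normalized_logps = policy_logps / response_lengths
    return -(2.0 / beta) * F.logsigmoid(-beta * normalized_logps - gamma).mean()
```

The key design difference, per the abstract, is that the reference-free loss avoids the reference model bias that can distort unlearning on forget data of varying difficulty.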

Authors (7)
  1. Chongyu Fan (9 papers)
  2. Jiancheng Liu (19 papers)
  3. Licong Lin (17 papers)
  4. Jinghan Jia (30 papers)
  5. Ruiqi Zhang (58 papers)
  6. Song Mei (56 papers)
  7. Sijia Liu (204 papers)
Citations (1)