Evaluate the effectiveness of machine unlearning in large language models
Determine a rigorous, standardized methodology to evaluate the effectiveness of machine unlearning in large language models, including clear criteria and metrics for assessing whether targeted knowledge has been removed rather than merely suppressed.
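One common framing scores an unlearning run along two axes, forget efficacy and retained utility, and probes for suppression by checking whether paraphrased queries still elicit the target knowledge. A minimal sketch of such metrics is below; the function names and the use of per-example correctness scores are illustrative assumptions, not a standardized protocol.

```python
# Hypothetical sketch of unlearning-evaluation metrics.
# Inputs are per-example correctness scores in [0, 1] measured
# before and after unlearning; the names are illustrative.

def forget_efficacy(pre_scores, post_scores):
    """Mean relative drop in correctness on the forget set.
    1.0 = target knowledge fully removed, 0.0 = unchanged."""
    drops = [(pre - post) / pre
             for pre, post in zip(pre_scores, post_scores) if pre > 0]
    return sum(drops) / len(drops)

def utility_retention(pre_scores, post_scores):
    """Fraction of retain-set performance preserved after unlearning."""
    return sum(post_scores) / sum(pre_scores)

def suppression_gap(direct_scores, paraphrase_scores):
    """Gap between paraphrased and direct probes after unlearning.
    A large positive gap suggests the knowledge was merely suppressed
    (still reachable via rephrasing) rather than removed."""
    return (sum(paraphrase_scores) / len(paraphrase_scores)
            - sum(direct_scores) / len(direct_scores))
```

For example, a run that drops forget-set correctness from [1.0, 0.8] to [0.0, 0.4] has forget efficacy 0.75, and a post-unlearning model that answers 10% of direct probes but 80% of paraphrased ones shows a suppression gap of 0.7, evidence of suppression rather than removal.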
References
Unlearning in LLMs is crucial for managing sensitive data and correcting misinformation, yet evaluating its effectiveness remains an open problem.
— Shah et al., "The Limits of Obliviate: Evaluating Unlearning in LLMs via Stimulus-Knowledge Entanglement-Behavior Framework," arXiv:2510.25732, 29 Oct 2025 (Abstract)