Generalization of SKeB findings to sensitive, non-fictional domains
Determine whether the relationships observed between persuasive prompt framing, domain graph entanglement metrics, and factual recall in unlearned models generalize to sensitive domains such as personally identifiable information (PII), harmful content, and copyrighted material. The question is open because fictional knowledge may be encoded differently from factual or personal information.
References
Whether our findings generalize to more sensitive domains (PII, harmful content, copyrighted material) remains an open research direction, as fictional knowledge may be encoded differently than factual/personal information.
— The Limits of Obliviate: Evaluating Unlearning in LLMs via Stimulus-Knowledge Entanglement-Behavior Framework
(2510.25732 - Shah et al., 29 Oct 2025) in Section: Limitations