Right to be Forgotten in the Era of Large Language Models: Implications, Challenges, and Solutions (2307.03941v4)

Published 8 Jul 2023 in cs.CY, cs.AI, and cs.CL

Abstract: The Right to be Forgotten (RTBF) was first established by the ruling in Google Spain SL, Google Inc. v AEPD, Mario Costeja González, and was later included as the Right to Erasure under the European Union's General Data Protection Regulation (GDPR), which gives individuals the right to request that organizations delete their personal data. For search engines in particular, individuals can request that their information be excluded from query results. It was a significant right that emerged from the evolution of technology. With the recent development of LLMs and their use in chatbots, LLM-enabled software systems have become popular, but they are not exempt from the RTBF. Compared with the indexing approach used by search engines, LLMs store and process information in a completely different way, which poses new challenges for compliance with the RTBF. In this paper, we explore these challenges and provide our insights on how to implement technical solutions for the RTBF, including the use of differential privacy, machine unlearning, model editing, and guardrails. With the rapid advancement of AI and the increasing need to regulate this powerful technology, learning from the case of the RTBF can provide valuable lessons for technical practitioners, legal experts, organizations, and authorities.
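
Of the technical solutions named in the abstract, machine unlearning is perhaps the most concrete to illustrate. Below is a minimal, hypothetical Python sketch of the sharding idea behind SISA-style unlearning (Bourtoule et al., "Machine Unlearning", IEEE S&P 2021): the training set is partitioned into shards with one model per shard, so honouring an erasure request only requires retraining the shard that held the deleted record. The class name `SisaEnsemble` and the use of scikit-learn are illustrative assumptions, not the paper's implementation; full SISA additionally slices each shard and checkpoints intermediate models, which this sketch omits.

```python
# Illustrative sketch only: sharded unlearning in the spirit of SISA,
# assuming numeric features and integer class labels. Not from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

class SisaEnsemble:
    """Shard the training set; on a deletion request, retrain only the
    shard that contained the record, not the full model."""

    def __init__(self, n_shards=4, seed=0):
        self.n_shards = n_shards
        self.rng = np.random.default_rng(seed)
        self.shards = []   # per-shard (X, y) training data
        self.models = []   # one model per shard

    def fit(self, X, y):
        # Randomly partition the training set into disjoint shards.
        idx = self.rng.permutation(len(X))
        for part in np.array_split(idx, self.n_shards):
            self.shards.append((X[part], y[part]))
            self.models.append(LogisticRegression().fit(X[part], y[part]))

    def forget(self, x_row):
        # Find the shard holding the record, drop it, retrain that shard only.
        for s, (Xs, ys) in enumerate(self.shards):
            keep = ~np.all(np.isclose(Xs, x_row), axis=1)
            if not keep.all():
                self.shards[s] = (Xs[keep], ys[keep])
                self.models[s] = LogisticRegression().fit(Xs[keep], ys[keep])
                return True
        return False  # record was not in the training data

    def predict(self, X):
        # Aggregate per-shard predictions by majority vote.
        votes = np.stack([m.predict(X) for m in self.models]).astype(int)
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

# Usage (with any integer-labelled numeric dataset):
#   ens = SisaEnsemble(); ens.fit(X, y); ens.forget(X[7]); ens.predict(X)
```

The design point is cost: a deletion touches one shard of size roughly n/k rather than the full dataset of size n, trading some accuracy (each model sees less data) for bounded retraining work per erasure request.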
