
Measuring memorization in language models via probabilistic extraction

Published 25 Oct 2024 in cs.LG (arXiv:2410.19482v3)

Abstract: LLMs are susceptible to memorizing training data, raising concerns about the potential extraction of sensitive information at generation time. Discoverable extraction is the most common method for measuring this issue: split a training example into a prefix and suffix, then prompt the LLM with the prefix, and deem the example extractable if the LLM generates the matching suffix using greedy sampling. This definition yields a yes-or-no determination of whether extraction was successful with respect to a single query. Though efficient to compute, we show that this definition is unreliable because it does not account for non-determinism present in more realistic (non-greedy) sampling schemes, for which LLMs produce a range of outputs for the same prompt. We introduce probabilistic discoverable extraction, which, without additional cost, relaxes discoverable extraction by considering multiple queries to quantify the probability of extracting a target sequence. We evaluate our probabilistic measure across different models, sampling schemes, and training-data repetitions, and find that this measure provides more nuanced information about extraction risk compared to traditional discoverable extraction.
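The core quantity behind probabilistic discoverable extraction can be illustrated with a small sketch (this is an illustration of the idea, not the authors' code): under a non-greedy sampling scheme, the probability of generating the target suffix given the prefix is the product of the per-token sampling probabilities, and over n independent queries the probability of extracting it at least once is 1 − (1 − p)^n. The token probabilities below are made-up numbers standing in for a model's sampling distribution at some temperature.

```python
import math

def extraction_probability(token_probs, n_queries):
    """Probability that at least one of n_queries independent samples
    produces the full target suffix.

    token_probs: per-token probabilities of the suffix tokens under the
    model's sampling distribution, conditioned on the prefix (assumed
    given; obtaining them from a real LLM is out of scope here).
    """
    p_suffix = math.prod(token_probs)            # p(suffix | prefix)
    return 1.0 - (1.0 - p_suffix) ** n_queries   # 1 - (1 - p)^n

# Toy example: a 3-token suffix whose tokens have sampling
# probabilities 0.9, 0.8, 0.5 (hypothetical values).
p_single = extraction_probability([0.9, 0.8, 0.5], n_queries=1)    # ~0.36
p_many = extraction_probability([0.9, 0.8, 0.5], n_queries=100)
```

Even when a single query extracts the suffix only about a third of the time (so greedy decoding might never surface it), a hundred queries make extraction nearly certain, which is the sense in which the single-query yes-or-no definition understates risk under realistic sampling.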

