Quantifying Memorization and Detecting Training Data of Pre-trained Language Models using Japanese Newspaper (2404.17143v2)
Abstract: Dominant pre-trained language models (PLMs) have demonstrated the potential risk of memorizing and outputting their training data. While this concern has been discussed mainly for English, it is also practically important to examine domain-specific PLMs. In this study, we pre-trained domain-specific GPT-2 models on a limited corpus of Japanese newspaper articles and evaluated their behavior. Our experiments replicated, in Japanese, the empirical finding that memorization in PLMs is related to duplication in the training data, model size, and prompt length, mirroring prior English studies. Furthermore, we attempted membership inference attacks and showed that training data can be detected in Japanese as well, matching the trend observed in English. The study warns that domain-specific PLMs, sometimes trained on valuable private data, can "copy and paste" on a large scale.
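The memorization measurement the abstract describes can be illustrated with a short sketch: prompt the model with a prefix taken from a training article and test whether greedy decoding reproduces the article's true continuation verbatim. The checkpoint name `gpt2` and the helper `is_memorized` below are placeholders (the paper's Japanese newspaper models are not public), so this is a minimal sketch of the protocol, not the authors' exact evaluation code.

```python
# Sketch: verbatim-memorization check via greedy decoding.
# "gpt2" stands in for the paper's private domain-specific Japanese GPT-2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder checkpoint, not the paper's model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def is_memorized(prompt: str, true_continuation: str, max_new_tokens: int = 50) -> bool:
    """Return True if greedy decoding reproduces the training continuation verbatim."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding, as in extraction-style evaluations
        )
    # Decode only the newly generated tokens, not the prompt.
    generated = tokenizer.decode(
        output_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return generated.startswith(true_continuation)
```

Sweeping the prompt length, model size, and the number of times an article is duplicated in the corpus, then counting the memorized fraction, would reproduce the kind of trends the abstract reports.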
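The membership inference attack can likewise be sketched with a standard loss-thresholding approach: score each candidate text by the model's average token loss and flag low-loss texts as likely training members. This is a common baseline attack, offered here as an assumption; the paper's exact attack setup may differ, and the threshold value below is purely illustrative.

```python
# Sketch: loss-thresholding membership inference.
# Low average token loss suggests the text was seen during training.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder for the target domain-specific model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def average_token_loss(text: str) -> float:
    """Cross-entropy per token under the model; lower values suggest membership."""
    ids = tokenizer(text, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        out = model(ids, labels=ids)  # HF shifts labels internally for causal LM loss
    return out.loss.item()

def infer_membership(text: str, threshold: float = 3.0) -> bool:
    # The threshold is an illustrative value; in practice it would be
    # calibrated on held-out member / non-member examples (e.g., via ROC analysis).
    return average_token_loss(text) < threshold
```

Comparing the loss distributions of newspaper articles inside versus outside the training corpus gives the separation that makes this detection possible, which is the trend the abstract reports for Japanese as well as English.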