Are Large Pre-Trained Language Models Leaking Your Personal Information? (2205.12628v2)

Published 25 May 2022 in cs.CL, cs.AI, and cs.CR

Abstract: Are Large Pre-Trained LLMs Leaking Your Personal Information? In this paper, we analyze whether Pre-Trained LLMs (PLMs) are prone to leaking personal information. Specifically, we query PLMs for email addresses with contexts of the email address or prompts containing the owner's name. We find that PLMs do leak personal information due to memorization. However, since the models are weak at association, the risk of specific personal information being extracted by attackers is low. We hope this work could help the community to better understand the privacy risk of PLMs and bring new insights to make PLMs safe.

Citations (142)

View on Semantic Scholar

Summary

The paper demonstrates that PLMs can memorize personal data, achieving up to 8.80% accuracy in predicting email addresses under extended contexts.
It distinguishes between memorization and association, showing that while models recall data effectively, they struggle to link it to specific individuals.
The authors suggest mitigation strategies, including deduplication and differential privacy, to reduce the risk of accidental personal information leakage.

Privacy Risks of Large Pre-Trained LLMs

The research article titled "Are Large Pre-Trained LLMs Leaking Your Personal Information?" provides an in-depth examination of privacy concerns associated with Pre-Trained LLMs (PLMs), particularly whether these models are susceptible to leaking personal information due to memorization and association. In light of the increasing adoption and deployment of PLMs such as GPT-Neo, this paper scrutinizes the capability of PLMs to recall training data, specifically email addresses, and the resulting privacy implications.

The paper begins by distinguishing between two capacities responsible for privacy leakage: memorization, wherein PLMs retain personal information, and association, where PLMs can potentially link personal information with its owner through specific prompts. This dichotomy is crucial as it sheds light on the mechanisms that might underpin privacy leakage in PLMs.

To assess these capacities, the authors conduct experiments using various prompt settings targeting the memorization and association of email addresses. Their findings reveal that PLMs exhibit significant memorization capabilities, notably when provided with extended context, achieving an accuracy of up to 8.80% in predicting email addresses correctly within the largest GPT-Neo model. In contrast, the accuracy drops considerably in scenarios where association is the focus, especially when the domain knowledge is unknown, reflecting the models' limited association capabilities.

Additionally, the article discourses on the vulnerability of larger PLMs, which tend to have a higher propensity for memorizing training data, thereby elevating privacy risks. This is evidenced by the improved prediction accuracy correlating with increased model parameters.

On the practical front, the implications are manifold. While PLMs can memorize data effectively, they are weak at associating this information with specific individuals, thus rendering targeted attacks less feasible. Nonetheless, the accidental leakage of personal information remains a critical concern, particularly if long text patterns or repetitive content are present in training data.

The authors suggest several mitigation strategies to counteract these risks, including pre-processing techniques like deduplication, employing differential privacy during training, and post-processing measures for filtering sensitive data in model outputs. These strategies highlight the importance of privacy preservation in the developmental pipeline of PLMs.

From a theoretical standpoint, the distinction between memorization and association prompts further exploration into defining and quantifying these phenomena in LLMs. This paper lays the groundwork for comprehensively understanding privacy leakages and catalyzes advancements in privacy-preserving techniques for PLMs.

In conclusion, while specific personal information leakage is low, PLMs still pose privacy risks due to their memorization strength, warranting careful consideration and application of privacy-conscious methodologies in their deployment. Future developments should focus on enhancing models' ability to protect sensitive data effectively, fostering a balance between model performance and privacy safety.

PDF Markdown

Related Papers

Tweets

https://twitter.com/886779698239332352/status/1740750722524704774

https://twitter.com/1487345386226614278/status/1740914147695739136