PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners (2310.02469v3)
Abstract: The proliferation of large language models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized LLMs. However, such domain-specific fine-tuning data often contains contextually sensitive personally identifiable information (PII). Fine-tuning LLMs directly on this data without privacy protection risks leaking sensitive PII at inference time. To address this challenge, we introduce Contextual Privacy Protection LLMs (PrivacyMind), a novel paradigm for fine-tuning LLMs that effectively injects domain-specific knowledge while safeguarding inference-time data privacy. Our work offers a theoretical analysis for model design and benchmarks techniques such as corpus curation, a penalty-based unlikelihood training loss, and instruction-based tuning. Extensive experiments across diverse datasets and scenarios demonstrate the effectiveness of our approaches. In particular, instruction tuning with both positive and negative examples stands out as a promising method, effectively protecting private data while enhancing the model's knowledge. Our work underscores the potential of LLMs as robust contextual privacy protection learners. The complete code and data for the work can be found at https://github.com/Yijia-Xiao/PrivacyMind.
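To make the "penalty-based unlikelihood training loss" concrete, below is a minimal PyTorch sketch of what such an objective could look like: the standard language-modeling loss on non-PII tokens plus an unlikelihood penalty on tokens flagged as PII. The mask construction and the penalty weight `alpha` are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of a penalty-based unlikelihood objective; the PII
# mask and `alpha` weighting are assumptions for illustration, not the
# paper's exact loss.
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, labels, pii_mask, alpha=1.0):
    """Standard LM loss on non-PII tokens plus a penalty on PII tokens.

    logits:   (batch, seq_len, vocab) model outputs
    labels:   (batch, seq_len) target token ids
    pii_mask: (batch, seq_len) bool, True where the target token is PII
    alpha:    weight of the unlikelihood penalty (assumed hyperparameter)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    token_logp = log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)

    # Likelihood term: maximize log-probability of non-PII target tokens.
    nll = -(token_logp * (~pii_mask)).sum() / (~pii_mask).sum().clamp(min=1)

    # Unlikelihood term: penalize probability mass placed on PII tokens,
    # pushing the model away from reproducing them at inference time.
    probs = token_logp.exp()
    ul = -(torch.log1p(-probs.clamp(max=1 - 1e-6)) * pii_mask).sum() / pii_mask.sum().clamp(min=1)

    return nll + alpha * ul
```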
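The abstract highlights instruction tuning with both positive and negative examples as the most promising technique. A hypothetical illustration of what one such training pair might look like is below; the field names and wording are assumptions for illustration, not the paper's released data format.

```python
# Hypothetical instruction-tuning pair contrasting a positive (PII-free)
# completion with a negative (PII-leaking) one; the schema is assumed,
# not taken from the paper's released data.
example = {
    "instruction": (
        "Answer the question using the clinical note, but do not reveal "
        "any personally identifiable information (PII)."
    ),
    "context": "Patient John Smith, DOB 1980-03-12, was prescribed metformin.",
    "question": "What medication was the patient prescribed?",
    "positive_response": "The patient was prescribed metformin.",  # desired behavior
    "negative_response": "John Smith was prescribed metformin.",   # leaks PII
}
```

Pairs like this let the model learn both what to say (the domain knowledge) and what not to say (the contextual PII), which is the intuition behind the paper's finding that combining positive and negative examples protects private data while enhancing the model's knowledge.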
Authors: Yijia Xiao, Yiqiao Jin, Yushi Bai, Yue Wu, Xianjun Yang, Xiao Luo, Wenchao Yu, Xujiang Zhao, Yanchi Liu, Haifeng Chen, Wei Wang, Wei Cheng, Quanquan Gu