Beyond Gradient and Priors in Privacy Attacks: Leveraging Pooler Layer Inputs of Language Models in Federated Learning (2312.05720v4)
Abstract: Large language models (LLMs) trained via federated learning (FL) demonstrate impressive capabilities in handling complex tasks while protecting user privacy. Recent studies indicate that leveraging gradient information and prior knowledge can potentially reveal training samples within the FL setting. However, these investigations have overlooked the privacy risks tied to the intrinsic architecture of the models themselves. This paper presents a two-stage privacy attack strategy that targets vulnerabilities in the architecture of contemporary LLMs, significantly enhancing attack performance by first recovering certain feature directions and using them as additional supervisory signals. Our comparative experiments demonstrate superior attack performance across various datasets and scenarios, highlighting the privacy leakage risk associated with the increasingly complex architectures of LLMs. We call on the community to recognize and address these potential privacy risks when designing LLMs.
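The abstract's mention of "pooler layer inputs" and "recovered feature directions" relates to a well-known property of linear layers under gradient sharing: for a BERT-style pooler y = Wx + b applied to the [CLS] hidden state x, a single sample's gradients satisfy dL/dW = (dL/dy) xᵀ and dL/db = dL/dy, so x can be read off as (dL/dW)[i, :] / (dL/db)[i] for any row i with a nonzero bias gradient. The sketch below is an illustration of that generic property only, not the paper's actual attack pipeline; the function name `recover_pooler_input` and the toy setup are assumptions introduced here. With batch-summed gradients the recovery yields only a mixture of inputs, i.e. a direction rather than an exact sample, which is consistent with the abstract's framing of recovered feature directions as supervisory signals.

```python
# Minimal sketch (PyTorch) of recovering a linear pooler's input from its
# weight/bias gradients. Illustrative only; names and setup are assumptions.
import torch


def recover_pooler_input(grad_W: torch.Tensor, grad_b: torch.Tensor,
                         eps: float = 1e-8) -> torch.Tensor:
    """Recover (up to batch mixing) the input of a linear pooler from its gradients.

    grad_W: gradient of the pooler weight, shape (hidden, hidden)
    grad_b: gradient of the pooler bias, shape (hidden,)
    """
    # Pick the row with the largest-magnitude bias gradient for numerical stability.
    i = torch.argmax(grad_b.abs())
    return grad_W[i] / (grad_b[i] + eps)


# Tiny self-check on a single sample: the recovered vector should match the
# true pooler input x (with a batch of summed gradients it would not).
if __name__ == "__main__":
    hidden = 16
    pooler = torch.nn.Linear(hidden, hidden)
    x = torch.randn(hidden)                 # stand-in for the [CLS] hidden state
    y = pooler(x)
    loss = y.tanh().sum()                   # stand-in downstream loss
    loss.backward()
    x_hat = recover_pooler_input(pooler.weight.grad, pooler.bias.grad)
    print(torch.allclose(x_hat, x, atol=1e-4))  # expected: True
```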