Leak and Learn: An Attacker's Cookbook to Train Using Leaked Data from Federated Learning (2403.18144v1)
Abstract: Federated learning is a decentralized learning paradigm introduced to preserve the privacy of client data. Despite this, prior work has shown that an attacker at the server can still reconstruct private training data using only the client updates. These attacks are known as data reconstruction attacks and fall into two major categories: gradient inversion (GI) and linear layer leakage (LLL) attacks. However, despite demonstrating the effectiveness of these attacks in breaching privacy, prior work has not investigated the usefulness of the reconstructed data for downstream tasks. In this work, we explore data reconstruction attacks through the lens of training and improving models with leaked data. We demonstrate that both GI and LLL attacks can be used to maliciously train models on the leaked data to higher accuracy than a benign federated learning strategy. Counter-intuitively, this improvement in training quality can occur despite limited reconstruction quality or a small total number of leaked images. Finally, we show the limitations that GI attacks and LLL attacks each face when the leaked data is used for downstream training.
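To make the gradient inversion family concrete, below is a minimal sketch of a DLG-style reconstruction loop: the attacker optimizes a dummy input so that the gradient it induces on the model matches the gradient shared by the client. The model `net`, leaked gradient `target_grads`, label tensor `y_true`, input shape, optimizer, and step count here are illustrative assumptions, not the paper's exact attack configuration.

```python
# Minimal sketch of a gradient-inversion (DLG-style) reconstruction, assuming
# a small PyTorch classifier `net`, a known label `y_true`, and the client's
# leaked gradient `target_grads` (one tensor per model parameter).
import torch
import torch.nn.functional as F

def gradient_inversion(net, target_grads, y_true, shape=(1, 3, 32, 32), steps=300):
    # Start from a random "dummy" image and optimize it so that the gradient
    # it induces on the model matches the gradient shared by the client.
    x_dummy = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([x_dummy], lr=0.1)

    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(net(x_dummy), y_true)
        dummy_grads = torch.autograd.grad(loss, net.parameters(), create_graph=True)
        # Gradient-matching objective: squared distance between dummy and leaked gradients.
        match = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, target_grads))
        match.backward()
        opt.step()

    return x_dummy.detach()  # reconstructed estimate of the private input
```

Linear layer leakage attacks avoid this optimization entirely: the server maliciously inserts a wide fully connected (imprint) layer, and when a single sample activates a given neuron, its input can be recovered in closed form from the ratio of that neuron's weight gradient to its bias gradient, which is what lets LLL attacks scale to large batches.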