SPEAR: Exact Gradient Inversion of Batches in Federated Learning (2403.03945v3)
Abstract: Federated learning is a framework for collaborative machine learning in which clients share only gradient updates, not their private data, with a server. However, it was recently shown that gradient inversion attacks can reconstruct this data from the shared gradients. In the important honest-but-curious setting, existing attacks enable exact reconstruction only for a batch size of $b=1$, with larger batches permitting only approximate reconstruction. In this work, we propose SPEAR, the first algorithm to reconstruct whole batches with $b>1$ exactly. SPEAR combines insights into the explicit low-rank structure of gradients with a sampling-based algorithm. Crucially, we leverage ReLU-induced gradient sparsity to precisely filter out large numbers of incorrect samples, making a final reconstruction step tractable. We provide an efficient GPU implementation for fully connected networks and show that it recovers high-dimensional ImageNet inputs in batches of up to $b \lesssim 25$ exactly while scaling to large networks. Finally, we show theoretically that much larger batches can be reconstructed with high probability given exponential time.
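The two structural facts the abstract relies on can be illustrated directly. For a fully connected layer, the weight gradient is a sum of $b$ rank-one outer products $\sum_i \delta_i x_i^\top$, so its rank is at most the batch size; and a ReLU zeroes the backward signal wherever the pre-activation is non-positive, producing the sparsity pattern SPEAR filters on. The sketch below (not SPEAR itself; all names and dimensions are illustrative) checks both properties with numpy:

```python
import numpy as np

# Illustrative sketch, not the SPEAR algorithm: verify that a linear
# layer's weight gradient has rank at most the batch size b, and that
# a ReLU zeroes the backward signal where pre-activations are <= 0.
rng = np.random.default_rng(0)
b, d_in, d_out = 4, 64, 32           # batch size and layer dimensions

X = rng.standard_normal((b, d_in))   # client inputs x_i (the attack target)
W = rng.standard_normal((d_out, d_in))
Z = X @ W.T                          # pre-activations z_i = W x_i
A = np.maximum(Z, 0.0)               # ReLU outputs

# Stand-in upstream gradients dL/dA; the ReLU masks them where z_i <= 0,
# which is the gradient sparsity the attack exploits.
G_up = rng.standard_normal((b, d_out))
Delta = G_up * (Z > 0)               # dL/dZ, sparse per the ReLU pattern

dW = Delta.T @ X                     # dL/dW = sum_i delta_i x_i^T

print(np.linalg.matrix_rank(dW))     # at most b = 4
print(np.all(Delta[Z <= 0] == 0))    # sparsity follows the ReLU mask
```

The rank bound is what makes exact recovery of small batches conceivable: the shared gradient `dW` is a mixture of only $b$ rank-one components, and the ReLU sparsity pattern gives a consistency check for telling correct candidate samples from incorrect ones.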