
Defending Against Data Reconstruction Attacks in Federated Learning: An Information Theory Approach (2403.01268v1)

Published 2 Mar 2024 in cs.LG, cs.CR, and cs.DC

Abstract: Federated Learning (FL) trains a black-box, high-dimensional model among different clients by exchanging parameters instead of sharing data directly, which mitigates the privacy leakage incurred by machine learning. However, FL still suffers from membership inference attacks (MIA) and data reconstruction attacks (DRA). In particular, an attacker can extract information from local datasets by mounting a DRA, which existing techniques such as Differential Privacy (DP) cannot effectively throttle. In this paper, we aim to ensure a strong privacy guarantee for FL under DRA. We prove that the reconstruction error under a DRA is constrained by the information acquired by the attacker, so constraining the transmitted information effectively throttles DRA. To quantify the information leakage incurred by FL, we establish a channel model, which depends on an upper bound on the joint mutual information between the local dataset and the parameters transmitted over multiple rounds. The channel model further indicates that the transmitted information can be constrained through operations in data space, which improves training efficiency and model accuracy under a fixed information budget. Based on the channel model, we propose algorithms that constrain the information transmitted in a single round of local training; with a limited number of training rounds, they ensure that the total amount of transmitted information is bounded. Furthermore, our channel model can be combined with various privacy-enhancing techniques (such as DP) to strengthen privacy guarantees against DRA. Extensive experiments with real-world datasets validate the effectiveness of our methods.
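The abstract's central claim, that an attacker's reconstruction error is governed by how much information the attacker acquires, follows a familiar rate-distortion/Fano pattern. The sketch below is a hypothetical illustration of that pattern, not the paper's exact theorem; here D denotes the local dataset, W_{1:T} the parameters transmitted over T rounds, and \hat{D} the attacker's reconstruction.

```latex
% Illustrative only: a rate-distortion style lower bound on the
% attacker's expected reconstruction error in terms of total leakage.
\[
  \mathbb{E}\big[ d(D, \hat{D}) \big] \;\ge\; R^{-1}\!\big( I(D; W_{1:T}) \big),
\]
% where R(\cdot) is the rate-distortion function of the data source.
% The chain rule splits the multi-round leakage into per-round terms,
% which motivates constraining the information sent in each round:
\[
  I(D; W_{1:T}) \;=\; \sum_{t=1}^{T} I\big( D; W_t \mid W_{1:t-1} \big).
\]
```

One standard way to cap a single round's leakage is to pass the local update through a Gaussian channel: clip the update to bound its signal power, then add Gaussian noise so that channel capacity upper-bounds the per-round mutual information. The sketch below assumes this Gaussian-channel view; `constrain_round_leakage` and its parameters are illustrative names, and the paper's own algorithms operate in data space rather than directly on the transmitted update, so treat this as a baseline sketch rather than the proposed method.

```python
import numpy as np

def constrain_round_leakage(update, clip_norm=1.0, noise_std=0.1):
    """Illustrative sketch (not the paper's algorithm): bound per-round
    leakage by sending the update through a noisy Gaussian channel."""
    d = update.size
    # Clip the update so its total "signal power" is at most clip_norm**2.
    norm = np.linalg.norm(update)
    if norm > clip_norm:
        update = update * (clip_norm / norm)
    # Independent Gaussian noise turns transmission into a Gaussian channel.
    noisy = update + np.random.normal(0.0, noise_std, size=update.shape)
    # Channel capacity (in nats) upper-bounds this round's mutual
    # information: d parallel channel uses with evenly split power.
    cap_nats = 0.5 * d * np.log(1.0 + clip_norm**2 / (d * noise_std**2))
    return noisy, cap_nats
```

Because the per-round bound adds across rounds via the chain rule above, running T rounds keeps the total leakage below T times the per-round capacity, which is the abstract's point about limiting the number of training rounds.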

