
Closed-Form Bounds for DP-SGD against Record-level Inference (2402.14397v1)

Published 22 Feb 2024 in cs.CR and cs.LG

Abstract: Machine learning models trained with differentially private (DP) algorithms such as DP-SGD enjoy resilience against a wide range of privacy attacks. Although it is possible to derive bounds for some attacks based solely on an $(\varepsilon,\delta)$-DP guarantee, meaningful bounds require a small enough privacy budget (i.e., injecting a large amount of noise), which results in a large loss in utility. This paper presents a new approach to evaluating the privacy of machine learning models against specific record-level threats, such as membership and attribute inference, without the indirection through DP. We focus on the popular DP-SGD algorithm and derive simple closed-form bounds. Our proofs model DP-SGD as an information-theoretic channel whose inputs are the secrets that an attacker wants to infer (e.g., membership of a data record) and whose outputs are the intermediate model parameters produced by iterative optimization. We obtain bounds for membership inference that match state-of-the-art techniques, whilst being orders of magnitude faster to compute. Additionally, we present a novel data-dependent bound against attribute inference. Our results provide a direct, interpretable, and practical way to evaluate the privacy of trained models against specific inference threats without sacrificing utility.
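To give a concrete feel for closed-form bounds of this kind, the sketch below upper-bounds the accuracy of a membership inference attack on DP-SGD directly from its hyperparameters, using the Gaussian differential privacy (GDP) central-limit approximation of Dong et al. rather than the paper's own channel-based derivation. The parameter names (noise_multiplier, sample_rate, steps) are illustrative assumptions, and the result is an approximation, not the bound proved in the paper.

```python
# Illustrative sketch, NOT the paper's exact bound: estimate a ceiling on
# membership inference accuracy for DP-SGD via the Gaussian DP (GDP)
# central-limit approximation (Dong et al., 2019).
from math import exp, sqrt

from scipy.stats import norm

def gdp_mu(noise_multiplier: float, sample_rate: float, steps: int) -> float:
    """CLT approximation: T steps of the subsampled Gaussian mechanism with
    sampling rate q and noise multiplier sigma behave roughly like mu-GDP
    with mu = q * sqrt(T * (exp(1/sigma^2) - 1))."""
    return sample_rate * sqrt(steps * (exp(1.0 / noise_multiplier**2) - 1.0))

def mia_accuracy_bound(noise_multiplier: float, sample_rate: float, steps: int) -> float:
    """Under mu-GDP, distinguishing members from non-members is no easier
    than distinguishing N(0,1) from N(mu,1); the attacker's advantage is at
    most the total variation distance 2*Phi(mu/2) - 1, so with balanced
    priors the attack accuracy is at most Phi(mu/2)."""
    mu = gdp_mu(noise_multiplier, sample_rate, steps)
    return norm.cdf(mu / 2.0)

# Example: sigma = 1.0, 1% sampling rate, 10,000 optimization steps.
print(f"membership inference accuracy <= {mia_accuracy_bound(1.0, 0.01, 10_000):.4f}")
```

Hyperparameter-only formulas like this are exactly the kind of fast, interpretable guarantee the abstract describes, although the paper's bounds are derived from the DP-SGD channel itself rather than from a GDP approximation.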

Authors (6)
  1. Giovanni Cherubin
  2. Boris Köpf
  3. Andrew Paverd
  4. Shruti Tople
  5. Lukas Wutschitz
  6. Santiago Zanella-Béguelin
