Fundamental Limits of Membership Inference Attacks on Machine Learning Models (2310.13786v5)

Published 20 Oct 2023 in stat.ML, cs.AI, and cs.LG

Abstract: Membership inference attacks (MIA) can reveal whether a particular data point was part of the training dataset, potentially exposing sensitive information about individuals. This article provides theoretical guarantees by exploring the fundamental statistical limitations associated with MIAs on machine learning models at large. More precisely, we first derive the statistical quantity that governs the effectiveness and success of such attacks. We then theoretically prove that in a non-linear regression setting with overfitting learning procedures, attacks may have a high probability of success. Finally, we investigate several situations for which we provide bounds on this quantity of interest. Interestingly, our findings indicate that discretizing the data might enhance the learning procedure's security: the quantity of interest is then shown to be bounded by a constant that quantifies the diversity of the underlying data distribution. We illustrate these results through simple simulations.
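The abstract's "statistical quantity that governs the effectiveness" of an attack is most naturally read through the standard hypothesis-testing view of membership inference: the attacker must distinguish the distribution P_in of the released model when the target point was in the training set from the distribution P_out when it was not, and the Bayes-optimal attack accuracy under a balanced prior is (1 + TV(P_in, P_out)) / 2, where TV denotes total variation distance. Whether this coincides with the paper's exact governing quantity is an assumption here; the paper's precise definition may condition differently.

Below is a minimal, self-contained sketch of the abstract's claim that overfitting procedures invite successful attacks, using a deliberately overfit polynomial regressor and a simple loss-threshold attack. The toy data, the degree-15 fit, and the median threshold are all illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch of a loss-threshold membership inference attack,
# illustrating (not reproducing) the abstract's claim that overfitting
# learning procedures are vulnerable. The toy setup below is an
# assumption for illustration, not the paper's protocol.
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Toy 1-D regression data: y = sin(x) + Gaussian noise."""
    x = rng.uniform(-3.0, 3.0, size=n)
    y = np.sin(x) + 0.3 * rng.normal(size=n)
    return x, y

# Members (training set) and non-members (fresh draws from the same law).
x_in, y_in = sample(30)
x_out, y_out = sample(30)

# Deliberately overfit: degree-15 polynomial least squares on 30 points.
model = np.poly1d(np.polyfit(x_in, y_in, deg=15))

# Attack statistic: per-point squared loss. An overfit model gives
# members conspicuously small losses relative to non-members.
loss_in = (model(x_in) - y_in) ** 2
loss_out = (model(x_out) - y_out) ** 2

# Loss-threshold attack: predict "member" when the loss falls below tau.
tau = np.median(np.concatenate([loss_in, loss_out]))
tpr = np.mean(loss_in < tau)    # members correctly flagged
tnr = np.mean(loss_out >= tau)  # non-members correctly rejected
print(f"attack accuracy: {(tpr + tnr) / 2:.2f} (0.50 = random guessing)")
```

On this toy problem the attack typically lands well above the 0.5 accuracy of random guessing; refitting with a low degree (say, deg=3) shrinks the gap between member and non-member losses and drags the attack back toward chance, mirroring the overfitting narrative in the abstract.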
