PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining (2402.09477v2)
Published 12 Feb 2024 in cs.CR and cs.LG
Abstract: We present PANORAMIA, a privacy leakage measurement framework for machine learning models that relies on membership inference attacks using generated data as non-members. By relying on generated non-member data, PANORAMIA eliminates the common dependency of privacy measurement tools on in-distribution non-member data. As a result, PANORAMIA does not modify the model, training data, or training process, and requires access only to a subset of the training data. We evaluate PANORAMIA on ML models for image and tabular data classification, as well as on large-scale language models.
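The core mechanism can be illustrated with a short sketch. This is a minimal illustration under assumed inputs, not PANORAMIA's actual implementation: given the target model's per-example losses on known training members and on generator-produced non-members, train an attack classifier on those losses and report its true-positive rate at a low false-positive rate. The helper name `audit` and the toy loss distributions below are hypothetical; the full framework additionally trains a baseline classifier that distinguishes real from generated data without access to the target model, to account for imperfections in the generator.

```python
# Sketch of a membership inference audit using generated non-members.
# Assumption: members tend to receive lower loss from the target model
# than generated (non-member) samples; the attack exploits that gap.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def audit(member_losses, generated_losses):
    """Train a loss-based attack and return its TPR at ~1% FPR.

    member_losses: target-model losses on a subset of the training data.
    generated_losses: target-model losses on generated non-member data.
    """
    x = np.concatenate([member_losses, generated_losses]).reshape(-1, 1)
    y = np.concatenate([np.ones_like(member_losses),
                        np.zeros_like(generated_losses)])
    x_tr, x_te, y_tr, y_te = train_test_split(
        x, y, test_size=0.5, random_state=0)
    attack = LogisticRegression().fit(x_tr, y_tr)
    scores = attack.predict_proba(x_te)[:, 1]  # P(member)
    # Threshold so that ~1% of held-out non-members are flagged,
    # then measure how many true members exceed it (low-FPR regime).
    thresh = np.percentile(scores[y_te == 0], 99)
    return (scores[y_te == 1] > thresh).mean()

# Toy usage with synthetic losses (hypothetical distributions):
rng = np.random.default_rng(0)
member_losses = rng.gamma(shape=2.0, scale=0.5, size=2000)     # lower losses
generated_losses = rng.gamma(shape=2.0, scale=0.8, size=2000)  # higher losses
print(f"TPR @ ~1% FPR: {audit(member_losses, generated_losses):.3f}")
```

A TPR well above the 1% FPR baseline indicates that the target model's losses separate members from generated non-members, i.e., measurable leakage; note that no retraining of the target model is involved, only loss queries.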