Are Normalizing Flows the Key to Unlocking the Exponential Mechanism? (2311.09200v4)
Abstract: The Exponential Mechanism (ExpM), designed for private optimization, has historically been sidelined on continuous sample spaces because it requires sampling from a generally intractable density and, to a lesser extent, bounding the sensitivity of the objective function. Any differential privacy (DP) mechanism can be instantiated as ExpM, and ExpM offers an elegant route to private ML that bypasses the inherent inefficiencies of DPSGD. This paper seeks to operationalize ExpM for private optimization and ML by using an auxiliary Normalizing Flow (NF), an expressive deep network for density learning, to approximately sample from the ExpM density. The resulting method, ExpM+NF, is an alternative to SGD-based methods for model training. We prove a sensitivity bound for the $\ell_2$ loss that permits ExpM use with any sampling method. To test feasibility, we present results on MIMIC-III health data comparing the accuracy and training time of (non-private) SGD, DPSGD, and ExpM+NF. We find that a model sampled from ExpM+NF is nearly as accurate as non-private SGD, is more accurate than DPSGD, and trains faster than Opacus' DPSGD implementation. Unable to provide a privacy proof for the NF approximation, we investigate privacy empirically, including via the LiRA membership inference attack of Carlini et al. and the recent privacy-auditing lower-bound method of Steinke et al. Our findings suggest ExpM+NF provides more privacy than non-private SGD but less than DPSGD, although many attacks are ineffective against any of the models. Ancillary contributions include pushing the state of the art in privacy and accuracy on MIMIC-III healthcare data, exhibiting the use of ExpM+NF for Bayesian inference, showing the limitations of empirical privacy auditing in practice, and providing several privacy theorems applicable to distribution learning.
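The core idea — training a normalizing flow to approximate the ExpM density $p(\theta) \propto \exp(-\varepsilon\, u(\theta) / (2\,\Delta u))$ for a loss $u$ with sensitivity $\Delta u$ — can be illustrated on a toy problem. The sketch below is not the paper's implementation: it uses a single affine "flow" $\theta = \mu + \sigma z$ fit by minimizing the reverse KL divergence $\mathbb{E}_q[\log q(\theta) - \log \tilde p(\theta)]$, chosen because with a quadratic loss the ExpM density is exactly Gaussian, so the fitted flow can be checked against the known answer. All variable names and hyperparameters are illustrative assumptions.

```python
import numpy as np

# Toy ExpM+NF-style fit (illustrative, not the paper's code): the target is
# the unnormalized ExpM log-density  log p~(theta) = -eps * u(theta) / (2*Delta),
# with quadratic loss u(theta) = (theta - theta_star)^2 and sensitivity Delta.
# The exact ExpM density is then N(theta_star, Delta/eps), which a one-layer
# affine flow theta = mu + sigma * z, z ~ N(0,1), can represent exactly.

rng = np.random.default_rng(0)

eps, delta_u = 4.0, 1.0   # privacy parameter and (assumed) sensitivity bound
theta_star = 2.0          # minimizer of the toy loss

mu, s = 0.0, 0.0          # flow parameters; sigma = exp(s) keeps sigma > 0
lr, batch = 0.02, 512

for _ in range(3000):
    sigma = np.exp(s)
    z = rng.standard_normal(batch)
    theta = mu + sigma * z  # reparameterized samples from the flow
    # Monte Carlo gradients of the reverse KL  E[log q(theta) - log p~(theta)]:
    # the entropy term contributes -1 to grad_s; the energy term is quadratic.
    grad_mu = (eps / delta_u) * np.mean(theta - theta_star)
    grad_s = -1.0 + (eps / delta_u) * sigma * np.mean((theta - theta_star) * z)
    mu -= lr * grad_mu
    s -= lr * grad_s

# The flow should recover the exact ExpM density N(theta_star, delta_u/eps),
# i.e. mu ~ 2.0 and sigma ~ sqrt(1/4) = 0.5 here.
print(round(mu, 2), round(np.exp(s), 2))
```

In the paper's setting the flow is a deep network and $u$ is a model-training loss, so the fit is only approximate — which is exactly why the privacy of the NF approximation must be audited empirically rather than proved.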
- Abadi, M. et al. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, 2016.
- Che, Z. et al. Recurrent neural networks for multivariate time series with missing values. Scientific reports, 8(1):1–12, 2018.
- Dong, J., Roth, A., and Su, W. J. Gaussian differential privacy. Journal of the Royal Statistical Society, 2021. URL https://par.nsf.gov/biblio/10215709.
- Dwork, C. Sequential composition of differential privacy. Theory of Cryptography, pp. 222–232, 2007.
- Dwork, C. Advanced composition of differential privacy. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, pp. 436–445, 2010.
- Dwork, C. and Roth, A. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014.
- Friedman, A. and Schuster, A. Data mining with differential privacy. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 493–502, 2010.
- Gopi, S., Lee, Y. T., and Wutschitz, L. Numerical composition of differential privacy. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W. (eds.), Advances in Neural Information Processing Systems, volume 34, pp. 11631–11642. Curran Associates, Inc., 2021. URL https://proceedings.neurips.cc/paper_files/paper/2021/file/6097d8f3714205740f30debe1166744e-Paper.pdf.
- Ji, Z., Lipton, Z. C., and Elkan, C. Differential privacy and machine learning: a survey and review. arXiv preprint arXiv:1412.7584, 2014.
- Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Scientific data, 3, 2016.
- Kairouz, P., Oh, S., and Viswanath, P. The composition theorem for differential privacy. IEEE Transactions on Information Theory, 63(6):4037–4049, 2017.
- Kapralov, M. and Talwar, K. On differentially private low rank approximation. In Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms, pp. 1395–1414. SIAM, 2013.
- Kasiviswanathan, S. P., Lee, H. K., Nissim, K., Raskhodnikova, S., and Smith, A. What can we learn privately? SIAM Journal on Computing, 40(3):793–826, 2011. URL https://doi.org/10.1137/090756090.
- Kong, Z. and Chaudhuri, K. The expressive power of a class of normalizing flow models. In International conference on artificial intelligence and statistics, pp. 3599–3609. PMLR, 2020.
- Kurakin, A. Applying differential privacy to large scale image classification, 2022. URL https://ai.googleblog.com/2022/02/applying-differential-privacy-to-large.html. Accessed: 2023-07-25.
- Lu, Y. and Lu, J. A universal approximation theorem of deep neural networks for expressing probability distributions. Advances in neural information processing systems, 33:3094–3105, 2020.
- McSherry, F. and Talwar, K. Mechanism design via differential privacy. In IEEE Symposium on Foundations of Computer Science, pp. 94–103, 2007. doi: 10.1109/FOCS.2007.66.
- Minami, K., Arai, H., Sato, I., and Nakagawa, H. Differential privacy without sensitivity. In Advances in Neural Information Processing Systems 29, 2016.
- Mironov, I. Rényi differential privacy. In 2017 IEEE 30th computer security foundations symposium (CSF), pp. 263–275. IEEE, 2017.
- Near, J. P. and Abuah, C. Programming differential privacy, 2022. URL https://programming-dp.com/. Accessed: 2023-09-15.
- Papamakarios, G. et al. Normalizing flows for probabilistic modeling and inference. Journal of Machine Learning Research, 22(57):1–64, 2021.
- Pedregosa, F. et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
- Ponomareva, N. et al. How to DP-fy ML: a practical guide to machine learning with differential privacy. Journal of Artificial Intelligence Research, 77:1113–1201, 2023.
- Rezende, D. J. and Mohamed, S. Variational inference with normalizing flows. In International Conference on Machine Learning, 2015.
- Sun, X. and Bischof, C. A basis-kernel representation of orthogonal matrices. SIAM journal on matrix analysis and applications, 16(4):1184–1196, 1995.
- Suriyakumar, V. M., Papernot, N., Goldenberg, A., and Ghassemi, M. Chasing your long tails: Differentially private prediction in health care settings. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pp. 723–734, 2021.
- Van Den Berg, R. et al. Sylvester normalizing flows for variational inference. In 34th Conference on Uncertainty in Artificial Intelligence, 2018.
- Verine, A. et al. On the expressivity of bi-Lipschitz normalizing flows. In Khan, E. and Gonen, M. (eds.), Proceedings of The 14th Asian Conference on Machine Learning, volume 189 of Proceedings of Machine Learning Research, pp. 1054–1069. PMLR, 12–14 Dec 2023. URL https://proceedings.mlr.press/v189/verine23a.html.
- Vinterbo, S. A. Differentially private projected histograms: Construction and use for prediction. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 19–34. Springer, 2012.
- Wang, S. et al. MIMIC-extract github repository. URL https://github.com/MLforHealth/MIMIC_Extract. Accessed: 2023-05-25.
- Wang, S. et al. MIMIC-Extract: A data extraction, preprocessing, and representation pipeline for MIMIC-III. In Proceedings of the ACM conference on health, inference, and learning, 2020. URL https://github.com/MLforHealth/MIMIC_Extract.
- A survey on differential privacy and applications. Jisuanji Xuebao/Chinese Journal of Computers, 37(1):101–122, 2014.
- Yousefpour, A. et al. Opacus: User-friendly differential privacy library in PyTorch. In NeurIPS Workshop Privacy in Machine Learning, 2021.
- Yousefpour, A. et al. Opacus codebase. https://opacus.ai/. Accessed: 2023-07-25.