PLAN: Variance-Aware Private Mean Estimation (2306.08745v3)
Abstract: Differentially private mean estimation is an important building block in privacy-preserving algorithms for data analysis and machine learning. Though the trade-off between privacy and utility is well understood in the worst case, many datasets exhibit structure that could potentially be exploited to yield better algorithms. In this paper we present $\textit{Private Limit Adapted Noise}$ (PLAN), a family of differentially private algorithms for mean estimation in the setting where inputs are independently sampled from a distribution $\mathcal{D}$ over $\mathbf{R}^d$, with coordinate-wise standard deviations $\boldsymbol{\sigma} \in \mathbf{R}^d$. Similar to mean estimation under Mahalanobis distance, PLAN tailors the shape of the noise to the shape of the data, but unlike previous algorithms the privacy budget is spent non-uniformly over the coordinates. Under a concentration assumption on $\mathcal{D}$, we show how to exploit skew in the vector $\boldsymbol{\sigma}$, obtaining a (zero-concentrated) differentially private mean estimate with $\ell_2$ error proportional to $\|\boldsymbol{\sigma}\|_1$. Previous work has either not taken $\boldsymbol{\sigma}$ into account, or measured error in Mahalanobis distance, in both cases resulting in $\ell_2$ error proportional to $\sqrt{d}\|\boldsymbol{\sigma}\|_2$, which can be up to a factor $\sqrt{d}$ larger. To verify the effectiveness of PLAN, we empirically evaluate accuracy on both synthetic and real-world data.
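To make the stated gap concrete, the short sketch below compares the two error scales named in the abstract, $\|\boldsymbol{\sigma}\|_1$ and $\sqrt{d}\|\boldsymbol{\sigma}\|_2$, for two illustrative choices of $\boldsymbol{\sigma}$. It is not an implementation of PLAN: the helper `error_scales` and the example vectors are assumptions made here for illustration, and constants and privacy parameters are ignored. It only checks the arithmetic behind the "up to a factor $\sqrt{d}$" claim.

```python
import math


def l1_norm(sigma):
    """Sum of absolute values of the coordinates."""
    return sum(abs(s) for s in sigma)


def l2_norm(sigma):
    """Euclidean norm of the vector."""
    return math.sqrt(sum(s * s for s in sigma))


def error_scales(sigma):
    """Return (||sigma||_1, sqrt(d) * ||sigma||_2) for a list of standard deviations.

    Hypothetical helper: these are only the quantities the two error bounds
    are proportional to, not the bounds themselves.
    """
    d = len(sigma)
    return l1_norm(sigma), math.sqrt(d) * l2_norm(sigma)


d = 100
uniform = [1.0] * d               # no skew: every coordinate has the same spread
skewed = [1.0] + [0.0] * (d - 1)  # extreme skew: all spread in one coordinate

for name, sigma in [("uniform", uniform), ("skewed", skewed)]:
    l1_scale, baseline_scale = error_scales(sigma)
    ratio = baseline_scale / l1_scale
    print(f"{name}: l1={l1_scale}, sqrt(d)*l2={baseline_scale}, ratio={ratio}")
```

With no skew the two scales coincide, so the ratio is 1. In the extreme skewed case the ratio reaches $\sqrt{d} = 10$. By the Cauchy-Schwarz inequality, $\|\boldsymbol{\sigma}\|_1 \le \sqrt{d}\|\boldsymbol{\sigma}\|_2$, so the ratio always lies between these two values.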