Differentially private projection-depth-based medians (2312.07792v3)
Abstract: We develop $(\epsilon,\delta)$-differentially private projection-depth-based medians using the propose-test-release (PTR) and exponential mechanisms. Under general conditions on the input parameters and the population measure (e.g., we do not assume any moment bounds), we quantify the probability that the test in PTR fails, as well as the cost of privacy, via finite-sample deviation bounds. Next, we show that when some observations are contaminated, the private projection-depth-based median does not break down, provided its input location and scale estimators do not break down. We demonstrate our main results on the canonical projection-depth-based median, as well as on projection-depth-based medians derived from trimmed estimators. In the Gaussian setting, we show that the resulting deviation bound matches the known lower bound for private Gaussian mean estimation. In the Cauchy setting, we show that the ``outlier error amplification'' effect resulting from the heavy tails outweighs the cost of privacy. This result is then verified via numerical simulations. Additionally, we present results on general PTR mechanisms and a uniform concentration result on the projected spacings of order statistics, which may be of general interest.
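For orientation, the non-private object being privatized can be sketched in a few lines. The sketch below assumes the canonical choice of input estimators (median for location, median absolute deviation for scale), so that the outlyingness of a point $x$ is $\sup_{\|u\|=1} |u^\top x - \mathrm{med}(u^\top X)| / \mathrm{MAD}(u^\top X)$ and the projection depth is $1/(1+\text{outlyingness})$. It approximates the supremum by a maximum over random unit directions and searches only over the sample points; the function names and the `n_dirs` parameter are illustrative and not taken from the paper, and neither the PTR nor the exponential mechanism is implemented here.

```python
import numpy as np

def projection_depth(points, data, n_dirs=500, seed=0):
    """Approximate projection depth of each row of `points` w.r.t. `data`.

    Assumption: median/MAD as location/scale, and a finite set of random
    unit directions standing in for the supremum over the unit sphere.
    """
    rng = np.random.default_rng(seed)
    d = data.shape[1]
    dirs = rng.standard_normal((n_dirs, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit directions

    proj_data = data @ dirs.T                         # shape (n, n_dirs)
    med = np.median(proj_data, axis=0)                # location per direction
    mad = np.median(np.abs(proj_data - med), axis=0)  # scale per direction

    proj_pts = points @ dirs.T                        # shape (m, n_dirs)
    outlyingness = np.max(np.abs(proj_pts - med) / mad, axis=1)
    return 1.0 / (1.0 + outlyingness)

def projection_median(data, **kwargs):
    """Sample point with the largest approximate projection depth (non-private)."""
    depths = projection_depth(data, data, **kwargs)
    return data[np.argmax(depths)]

# Usage: 200 Gaussian points centred at (3, -2) plus 20 gross outliers.
rng = np.random.default_rng(1)
clean = rng.standard_normal((200, 2)) + np.array([3.0, -2.0])
data = np.vstack([clean, np.full((20, 2), 100.0)])
print(projection_median(data))
```

Because the median and MAD of each projection are insensitive to a minority of contaminated points, the deepest sample point stays near the bulk of the data, which is the robustness property the abstract refers to. The sketch assumes the MAD is nonzero in every sampled direction, which holds for continuous data but not for heavily tied samples.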