About the Cost of Central Privacy in Density Estimation (2306.14535v4)

Published 26 Jun 2023 in cs.AI, math.ST, and stat.TH

Abstract: We study non-parametric density estimation for densities in Lipschitz and Sobolev spaces under central privacy. In particular, we investigate regimes where the privacy budget is not assumed to be constant. We consider the classical definition of central differential privacy, as well as the more recent notion of central concentrated differential privacy. We recover the result of Barber and Duchi (2014) stating that histogram estimators are optimal against Lipschitz distributions for the L2 risk under regular differential privacy, and we extend it to other norms and notions of privacy. Then, we investigate higher degrees of smoothness, drawing two conclusions. First, contrary to what happens with a constant privacy budget (Wasserman and Zhou, 2010), there are regimes where imposing privacy degrades the regular minimax risk of estimation on Sobolev densities. Second, so-called projection estimators are near-optimal against the same classes of densities in this new setup with pure differential privacy but, contrary to the constant privacy budget case, this comes at the cost of relaxation. With zero-concentrated differential privacy, there is no need for relaxation, and we prove that the estimation is optimal.
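
To make the histogram estimators mentioned in the abstract concrete, here is a minimal Python sketch of a noisy histogram density estimate on [0, 1]. It is not taken from the paper: the function name `private_histogram_density`, the restriction to [0, 1], the replace-one neighbouring relation, and the choice of Laplace noise for pure differential privacy and Gaussian noise for zero-concentrated differential privacy are all assumptions made for illustration. The number of bins is left to the caller and is not tuned to any rate from the paper.

```python
import numpy as np


def private_histogram_density(samples, n_bins, epsilon=None, rho=None, rng=None):
    """Noisy histogram density estimate on [0, 1] (illustrative sketch).

    Exactly one of `epsilon` (pure DP, Laplace noise) or `rho`
    (zero-concentrated DP, Gaussian noise) must be provided.
    Assumes every sample lies in [0, 1] and that the sample size is public.
    """
    if (epsilon is None) == (rho is None):
        raise ValueError("provide exactly one of epsilon or rho")
    rng = np.random.default_rng() if rng is None else rng

    samples = np.asarray(samples, dtype=float)
    n = samples.size
    counts, edges = np.histogram(samples, bins=n_bins, range=(0.0, 1.0))

    # Assumption: neighbouring datasets differ by replacing one sample, so at
    # most two bin counts change by 1 each: L1 sensitivity 2, L2 sensitivity sqrt(2).
    if epsilon is not None:
        # Laplace mechanism: scale = L1 sensitivity / epsilon.
        noise = rng.laplace(scale=2.0 / epsilon, size=n_bins)
    else:
        # Gaussian mechanism: sigma^2 = (L2 sensitivity)^2 / (2 * rho) = 1 / rho.
        noise = rng.normal(scale=1.0 / np.sqrt(rho), size=n_bins)

    bin_width = 1.0 / n_bins
    # Piecewise-constant density: noisy count divided by (n * bin width).
    density = (counts + noise) / (n * bin_width)
    return density, edges


if __name__ == "__main__":
    demo_rng = np.random.default_rng(0)
    data = demo_rng.uniform(size=1000)
    density, edges = private_histogram_density(data, n_bins=10, epsilon=1.0, rng=demo_rng)
    print(density.shape, edges.shape)
```

The returned values can be negative and need not integrate to one, because the sketch applies no post-processing such as clipping or renormalisation. Passing `rho` instead of `epsilon` switches from Laplace to Gaussian noise.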

References (61)
  1. Deep learning with differential privacy. In Edgar R. Weippl, Stefan Katzenbeisser, Christopher Kruegel, Andrew C. Myers, and Shai Halevi (eds.), Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24-28, 2016, pp.  308–318. ACM, 2016. doi: 10.1145/2976749.2978318. URL https://doi.org/10.1145/2976749.2978318.
  2. John M Abowd. The US Census Bureau adopts differential privacy. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp.  2867–2867, 2018.
  3. Differentially private testing of identity and closeness of discrete distributions. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett (eds.), Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, pp.  6879–6891, 2018. URL https://proceedings.neurips.cc/paper/2018/hash/7de32147a4f1055bed9e4faf3485a84d-Abstract.html.
  4. Optimal rates for nonparametric density estimation under communication constraints. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (eds.), Advances in Neural Information Processing Systems, volume 34, pp.  26754–26766. Curran Associates, Inc., 2021a. URL https://proceedings.neurips.cc/paper_files/paper/2021/file/e1021d43911ca2c1845910d84f40aeae-Paper.pdf.
  5. Information-constrained optimization: can adaptive processing of gradients help? CoRR, abs/2104.00979, 2021b. URL https://arxiv.org/abs/2104.00979.
  6. Unified lower bounds for interactive high-dimensional estimation under information constraints. CoRR, abs/2010.06562, 2021c. URL https://arxiv.org/abs/2010.06562.
  7. Inference under information constraints iii: Local privacy constraints. IEEE Journal on Selected Areas in Information Theory, 2(1):253–267, 2021d. doi: 10.1109/JSAIT.2021.3053569. URL https://doi.org/10.1109/JSAIT.2021.3053569.
  8. Differentially private Assouad, Fano, and Le Cam. In Vitaly Feldman, Katrina Ligett, and Sivan Sabato (eds.), Algorithmic Learning Theory, 16-19 March 2021, Virtual Conference, Worldwide, volume 132 of Proceedings of Machine Learning Research, pp.  48–78. PMLR, 2021e. URL http://proceedings.mlr.press/v132/acharya21a.html.
  9. Near instance-optimality in differential privacy. CoRR, abs/2005.10630, 2020a. URL https://arxiv.org/abs/2005.10630.
  10. Instance-optimality in differential privacy via approximate inverse sensitivity mechanisms. In Hugo Larochelle, Marc’Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (eds.), Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020b. URL https://proceedings.neurips.cc/paper/2020/hash/a267f936e54d7c10a2bb70dbe6ad7a89-Abstract.html.
  11. From robustness to privacy and back. CoRR, abs/2302.01855, 2023. doi: 10.48550/arXiv.2302.01855. URL https://doi.org/10.48550/arXiv.2302.01855.
  12. Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In Carey L. Williamson, Mary Ellen Zurko, Peter F. Patel-Schneider, and Prashant J. Shenoy (eds.), Proceedings of the 16th International Conference on World Wide Web, WWW 2007, Banff, Alberta, Canada, May 8-12, 2007, pp.  181–190. ACM, 2007. doi: 10.1145/1242572.1242598. URL https://doi.org/10.1145/1242572.1242598.
  13. Privacy and statistical risk: Formalisms and minimax bounds, 2014.
  14. Fisher information for distributed estimation under a blackboard communication protocol. In 2019 IEEE International Symposium on Information Theory (ISIT), pp.  2704–2708, 2019. doi: 10.1109/ISIT.2019.8849821.
  15. Lower bounds for learning distributions under communication constraints via Fisher information. Journal of Machine Learning Research, 21:Paper No. 236, 30, 2020. ISSN 1532-4435. URL https://jmlr.csail.mit.edu/papers/volume21/19-737/19-737.pdf.
  16. Strongly universally consistent nonparametric regression and classification with privatised data. Electronic Journal of Statistics, 15(1):2430–2453, 2021. doi: 10.1214/21-EJS1845. URL https://doi.org/10.1214/21-EJS1845.
  17. Coinpress: Practical private mean and covariance estimation. In Hugo Larochelle, Marc’Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (eds.), Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020. URL https://proceedings.neurips.cc/paper/2020/hash/a684eceee76fc522773286a895bc8436-Abstract.html.
  18. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In Martin Hirt and Adam D. Smith (eds.), Theory of Cryptography - 14th International Conference, TCC 2016-B, Beijing, China, October 31 - November 3, 2016, Proceedings, Part I, volume 9985 of Lecture Notes in Computer Science, pp.  635–658, 2016. doi: 10.1007/978-3-662-53641-4_24. URL https://doi.org/10.1007/978-3-662-53641-4_24.
  19. Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids. CoRR, abs/1903.01927, 2019. URL http://arxiv.org/abs/1903.01927.
  20. Collecting telemetry data privately. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (eds.), Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pp.  3571–3580, 2017. URL https://proceedings.neurips.cc/paper/2017/hash/253614bbac999b38b5b60cae531c4969-Abstract.html.
  21. Revealing information while preserving privacy. In Frank Neven, Catriel Beeri, and Tova Milo (eds.), Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 9-12, 2003, San Diego, CA, USA, pp.  202–210. ACM, 2003. doi: 10.1145/773153.773173. URL https://doi.org/10.1145/773153.773173.
  22. Local privacy and statistical minimax rates. In 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013, Allerton Park & Retreat Center, Monticello, IL, USA, October 2-4, 2013, pp.  1592. IEEE, 2013. doi: 10.1109/Allerton.2013.6736718. URL https://doi.org/10.1109/Allerton.2013.6736718.
  23. Local privacy, data processing inequalities, and statistical minimax rates, 2014. URL https://arxiv.org/abs/1302.3203.
  24. Minimax optimal procedures for locally private estimation. CoRR, abs/1604.02390, 2016. URL http://arxiv.org/abs/1604.02390.
  25. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014. doi: 10.1561/0400000042. URL https://doi.org/10.1561/0400000042.
  26. Concentrated differential privacy. arXiv preprint arXiv:1603.01887, 2016.
  27. Our data, ourselves: Privacy via distributed noise generation. In Serge Vaudenay (ed.), Advances in Cryptology - EUROCRYPT 2006, 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, May 28 - June 1, 2006, Proceedings, volume 4004 of Lecture Notes in Computer Science, pp. 486–503. Springer, 2006a. doi: 10.1007/11761679_29. URL https://doi.org/10.1007/11761679_29.
  28. Calibrating noise to sensitivity in private data analysis. In Shai Halevi and Tal Rabin (eds.), Theory of Cryptography, Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006, Proceedings, volume 3876 of Lecture Notes in Computer Science, pp.  265–284. Springer, 2006b. doi: 10.1007/11681878_14. URL https://doi.org/10.1007/11681878_14.
  29. RAPPOR: randomized aggregatable privacy-preserving ordinal response. In Gail-Joon Ahn, Moti Yung, and Ninghui Li (eds.), Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AZ, USA, November 3-7, 2014, pp. 1054–1067. ACM, 2014. doi: 10.1145/2660267.2660348. URL https://doi.org/10.1145/2660267.2660348.
  30. Model inversion attacks that exploit confidence information and basic countermeasures. In Indrajit Ray, Ninghui Li, and Christopher Kruegel (eds.), Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, October 12-16, 2015, pp. 1322–1333. ACM, 2015. doi: 10.1145/2810103.2813677. URL https://doi.org/10.1145/2810103.2813677.
  31. Christophe Giraud. Introduction to high-dimensional statistics. Chapman and Hall/CRC, 2021. ISBN 9781003158745. doi: 10.1201/9781003158745.
  32. Sparsity in neural networks can improve their privacy, 2023.
  33. On rate optimal private regression under local differential privacy. arXiv preprint arXiv:2206.00114, 2022.
  34. A Distribution-Free Theory of Nonparametric Regression. Springer series in statistics. Springer, 2002. ISBN 978-0-387-95441-7. doi: 10.1007/b97848. URL https://doi.org/10.1007/b97848.
  35. Multivariate density estimation from privatised data: universal consistency and minimax rates. Journal of Nonparametric Statistics, 0(0):1–23, 2023. doi: 10.1080/10485252.2022.2163634. URL https://doi.org/10.1080/10485252.2022.2163634.
  36. Differential privacy for functions and functional data. J. Mach. Learn. Res., 14(1):703–727, 2013. doi: 10.5555/2567709.2502603. URL https://dl.acm.org/doi/10.5555/2567709.2502603.
  37. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genet, 4(8):e1000167, 2008.
  38. Privately learning high-dimensional distributions. In Alina Beygelzimer and Daniel Hsu (eds.), Conference on Learning Theory, COLT 2019, 25-28 June 2019, Phoenix, AZ, USA, volume 99 of Proceedings of Machine Learning Research, pp.  1853–1902. PMLR, 2019. URL http://proceedings.mlr.press/v99/kamath19a.html.
  39. Improved rates for differentially private stochastic convex optimization with heavy-tailed data. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, and Sivan Sabato (eds.), International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA, volume 162 of Proceedings of Machine Learning Research, pp.  10633–10660. PMLR, 2022. URL https://proceedings.mlr.press/v162/kamath22a.html.
  40. A bias-variance-privacy trilemma for statistical estimation, 2023.
  41. Finite sample differentially private confidence intervals. In Anna R. Karlin (ed.), 9th Innovations in Theoretical Computer Science Conference, ITCS 2018, January 11-14, 2018, Cambridge, MA, USA, volume 94 of LIPIcs, pp.  44:1–44:9. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi: 10.4230/LIPIcs.ITCS.2018.44. URL https://doi.org/10.4230/LIPIcs.ITCS.2018.44.
  42. Martin Kroll. On density estimation at a fixed point under local differential privacy. Electronic Journal of Statistics, 15(1):1783–1813, 2021. doi: 10.1214/21-EJS1830. URL https://doi.org/10.1214/21-EJS1830.
  43. Private quantiles estimation in the presence of atoms. CoRR, abs/2202.08969, 2022. URL https://arxiv.org/abs/2202.08969.
  44. On the Statistical Complexity of Estimation and Testing under Privacy Constraints. Transactions on Machine Learning Research Journal, April 2023a. URL https://hal.science/hal-03794374v2.
  45. Private Statistical Estimation of Many Quantiles. In ICML 2023 - 40th International Conference on Machine Learning, Honolulu, United States, July 2023b. URL https://hal.science/hal-03986170.
  46. Minimax optimal goodness-of-fit testing for densities and multinomials under a local differential privacy constraint. Bernoulli, 28(1):579–600, 2022.
  47. The disclosure of diagnosis codes can breach research participants’ privacy. J. Am. Medical Informatics Assoc., 17(3):322–327, 2010. doi: 10.1136/jamia.2009.002725. URL https://doi.org/10.1136/jamia.2009.002725.
  48. Ilya Mironov. Rényi differential privacy. In 30th IEEE Computer Security Foundations Symposium, CSF 2017, Santa Barbara, CA, USA, August 21-25, 2017, pp.  263–275. IEEE Computer Society, 2017. doi: 10.1109/CSF.2017.11. URL https://doi.org/10.1109/CSF.2017.11.
  49. How to break anonymity of the Netflix Prize dataset. CoRR, abs/cs/0610105, 2006. URL http://arxiv.org/abs/cs/0610105.
  50. Robust de-anonymization of large sparse datasets. In 2008 IEEE Symposium on Security and Privacy (S&P 2008), 18-21 May 2008, Oakland, California, USA, pp.  111–125. IEEE Computer Society, 2008. doi: 10.1109/SP.2008.33. URL https://doi.org/10.1109/SP.2008.33.
  51. High dimensional statistics. MIT lecture notes for course 18S997, 2015. URL https://math.mit.edu/~rigollet/PDFs/RigNotes17.pdf.
  52. Adaptive pointwise density estimation under local differential privacy, 2022.
  53. Vikrant Singhal. A polynomial time, pure differentially private estimator for binary product distributions, 2023.
  54. Latanya Sweeney. Simple demographics often identify people uniquely. Health (San Francisco), 671(2000):1–34, 2000.
  55. Latanya Sweeney. k-anonymity: A model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl. Based Syst., 10(5):557–570, 2002. doi: 10.1142/S0218488502001648. URL https://doi.org/10.1142/S0218488502001648.
  56. Learning new words. Granted US Patents, 9594741, 2017.
  57. Alexandre B. Tsybakov. Introduction to Nonparametric Estimation. Springer series in statistics. Springer, 2009. ISBN 978-0-387-79051-0. doi: 10.1007/b13794. URL https://doi.org/10.1007/b13794.
  58. A. W. Van der Vaart. Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 1998. doi: 10.1017/CBO9780511802256.
  59. Tim van Erven and Peter Harremoës. Rényi divergence and Kullback-Leibler divergence. IEEE Trans. Inf. Theory, 60(7):3797–3820, 2014. doi: 10.1109/TIT.2014.2320500. URL https://doi.org/10.1109/TIT.2014.2320500.
  60. Technical privacy metrics: A systematic survey. ACM Comput. Surv., 51(3):57:1–57:38, 2018. doi: 10.1145/3168389. URL https://doi.org/10.1145/3168389.
  61. A statistical framework for differential privacy. Journal of the American Statistical Association, 105(489):375–389, 2010. doi: 10.1198/jasa.2009.tm08651. URL https://doi.org/10.1198/jasa.2009.tm08651.