Mean Estimation Under Heterogeneous Privacy Demands (2310.13137v1)
Abstract: Differential Privacy (DP) is a well-established framework for quantifying the privacy loss incurred by any algorithm. Traditional formulations impose a uniform privacy requirement on all users, which is often inconsistent with real-world scenarios in which users set their privacy preferences individually. This work considers the problem of mean estimation, where each user can impose their own distinct privacy level. The algorithm we propose is shown to be minimax optimal and has near-linear run-time. Our results reveal an interesting saturation phenomenon: the privacy requirements of the most stringent users dictate the overall error rate. As a consequence, users with less stringent (but differing) privacy requirements are all granted more privacy than they demand, in equal amounts. In other words, these privacy-indifferent users receive a nontrivial degree of privacy for free, without any sacrifice in the estimator's performance.
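To make the setting concrete, the following is a minimal baseline sketch (not the paper's minimax-optimal algorithm): each user clips their value and adds Laplace noise calibrated to their own budget ε_i, so each report is ε_i-differentially private, and the server averages the reports. The function name, clipping range, and budgets below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def hetero_private_mean(x, eps, clip=1.0):
    """Naive mean estimate under per-user privacy budgets eps.

    Each user i clips their value to [-clip, clip] and adds Laplace
    noise with scale 2*clip/eps[i] (sensitivity 2*clip), making their
    report eps[i]-DP; the server simply averages the noisy reports.
    """
    x = np.clip(np.asarray(x, dtype=float), -clip, clip)
    eps = np.asarray(eps, dtype=float)
    noise = rng.laplace(loc=0.0, scale=2.0 * clip / eps)
    return float(np.mean(x + noise))

# Users with looser budgets (larger eps) inject less noise; the most
# stringent users (smallest eps) dominate the estimator's error,
# echoing the saturation phenomenon described in the abstract.
x = rng.uniform(-1, 1, size=1000)
eps = np.concatenate([np.full(500, 0.5), np.full(500, 5.0)])
est = hetero_private_mean(x, eps)
```

Note that uniform averaging is suboptimal here: the paper's point is that a carefully designed estimator attains the minimax rate, which is governed by the most stringent users' budgets.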
- S. Chaudhuri and T. Courtade, “Mean estimation under heterogeneous privacy: Some privacy can be free,” in 2023 IEEE International Symposium on Information Theory (ISIT), 2023.
- C. Alaimo and J. Kallinikos, “Computing the everyday: Social media as data platforms,” The Information Society, 2017.
- A. Aaltonen, C. Alaimo, and J. Kallinikos, “The making of data commodities: Data analytics as an embedded process,” Journal of Management Information Systems, 2021.
- EU, “Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation),” pp. 1–88, May 2016.
- CA, “California Consumer Privacy Act (CCPA),” Office of the Attorney General, California Department of Justice, 2018.
- L. J. Hoffman, “Computers and privacy: A survey,” ACM Computing Surveys, 1969.
- R. Agrawal and R. Srikant, “Privacy-preserving data mining,” SIGMOD Rec., 2000.
- P. R. M. Rao, S. Murali Krishna, and A. P. Siva Kumar, “Privacy preservation techniques in big data analytics: a survey,” Journal of Big Data, 2018.
- J. Schlörer, “Identification and retrieval of personal records from a statistical data bank,” Methods of Information in Medicine, 1975.
- P. Samarati and L. Sweeney, “Generalizing data to provide anonymity when disclosing information,” in PODS, 1998.
- A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, “L-diversity: privacy beyond k-anonymity,” in 22nd International Conference on Data Engineering (ICDE), 2006.
- S. Asoodeh, F. Alajaji, and T. Linder, “Notes on information-theoretic privacy,” in 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2014, pp. 1272–1278.
- C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor, “Our data, ourselves: Privacy via distributed noise generation,” in Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 2006.
- C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography Conference. Springer, 2006.
- J. M. Abowd, “The U.S. Census Bureau adopts differential privacy,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Association for Computing Machinery, 2018.
- U. Erlingsson, V. Pihur, and A. Korolova, “RAPPOR: Randomized aggregatable privacy-preserving ordinal response,” in Proceedings of the 21st ACM Conference on Computer and Communications Security, Scottsdale, Arizona, 2014.
- J. Tang, A. Korolova, X. Bai, X. Wang, and X. Wang, “Privacy loss in Apple’s implementation of differential privacy on macOS 10.12,” arXiv preprint arXiv:1709.02753, 2017.
- I. Mironov, “Rényi differential privacy,” in 2017 IEEE 30th Computer Security Foundations Symposium (CSF). IEEE, 2017.
- C. Dwork and G. N. Rothblum, “Concentrated differential privacy,” arXiv preprint arXiv:1603.01887, 2016.
- M. Bun and T. Steinke, “Concentrated differential privacy: Simplifications, extensions, and lower bounds,” in Theory of Cryptography Conference. Springer, 2016.
- T. Wang, X. Zhang, J. Feng, and X. Yang, “A comprehensive survey on local differential privacy toward data statistics and analysis,” Sensors, 2020.
- I. Kotsogiannis, S. Doudalis, S. Haney, A. Machanavajjhala, and S. Mehrotra, “One-sided differential privacy,” in 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 2020.
- G. Kamath, J. Li, V. Singhal, and J. Ullman, “Privately learning high-dimensional distributions,” in Conference on Learning Theory. PMLR, 2019.
- S. B. Hopkins, G. Kamath, and M. Majid, “Efficient mean estimation with pure differential privacy via a sum-of-squares exponential mechanism,” in Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, 2022.
- G. Kamath and J. Ullman, “A primer on private statistics,” arXiv preprint arXiv:2005.00010, 2020.
- T. T. Cai, Y. Wang, and L. Zhang, “The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy,” The Annals of Statistics, 2021.
- S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith, “What can we learn privately?” SIAM Journal on Computing, 2011.
- J. C. Duchi, M. J. Wainwright, and M. I. Jordan, “Minimax optimal procedures for locally private estimation,” Journal of the American Statistical Association, 2016.
- R. Bassily, A. Cheu, S. Moran, A. Nikolov, J. Ullman, and S. Wu, “Private query release assisted by public data,” in International Conference on Machine Learning. PMLR, 2020.
- R. Bassily, S. Moran, and A. Nandi, “Learning from mixtures of private and public populations,” Advances in Neural Information Processing Systems, 2020.
- T. Liu, G. Vietri, T. Steinke, J. Ullman, and S. Wu, “Leveraging public data for practical private query release,” in International Conference on Machine Learning. PMLR, 2021.
- N. Alon, R. Bassily, and S. Moran, “Limits of private learning with access to public data,” Advances in neural information processing systems, 2019.
- A. Nandi and R. Bassily, “Privately answering classification queries in the agnostic PAC model,” in Algorithmic Learning Theory. PMLR, 2020.
- P. Kairouz, M. R. Diaz, K. Rush, and A. Thakurta, “(Nearly) dimension independent private ERM with AdaGrad rates via publicly estimated subspaces,” in Conference on Learning Theory. PMLR, 2021.
- E. Amid, A. Ganesh, R. Mathews, S. Ramaswamy, S. Song, T. Steinke, V. M. Suriyakumar, O. Thakkar, and A. Thakurta, “Public data-assisted mirror descent for private model training,” in International Conference on Machine Learning. PMLR, 2022.
- D. Wang, H. Zhang, M. Gaboardi, and J. Xu, “Estimating smooth GLM in the non-interactive local differential privacy model with public unlabeled data,” in Algorithmic Learning Theory. PMLR, 2021.
- Z. Ji and C. Elkan, “Differential privacy based on importance weighting,” Machine learning, 2013.
- A. Bie, G. Kamath, and V. Singhal, “Private estimation with public data,” in Advances in Neural Information Processing Systems, 2022.
- B. Avent, A. Korolova, D. Zeber, T. Hovden, and B. Livshits, “BLENDER: Enabling local search with a hybrid differential privacy model,” in 26th USENIX Security Symposium, 2017.
- A. Beimel, A. Korolova, K. Nissim, O. Sheffet, and U. Stemmer, “The Power of Synergy in Differential Privacy: Combining a Small Curator with Local Randomizers,” in 1st Conference on Information-Theoretic Cryptography (ITC 2020), 2020.
- J. Liu, J. Lou, L. Xiong, J. Liu, and X. Meng, “Projected federated averaging with heterogeneous differential privacy,” Proceedings of the VLDB Endowment, 2021.
- N. Aldaghri, H. Mahdavifar, and A. Beirami, “FeO2: Federated learning with opt-out differential privacy,” CoRR, vol. abs/2110.15252, 2021. [Online]. Available: https://arxiv.org/abs/2110.15252
- M. Alaggan, S. Gambs, and A.-M. Kermarrec, “Heterogeneous differential privacy,” Journal of Privacy and Confidentiality, 2017.
- H. Li, L. Xiong, Z. Ji, and X. Jiang, “Partitioning-based mechanisms under personalized differential privacy,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 2017.
- Z. Jorgensen, T. Yu, and G. Cormode, “Conservative or liberal? Personalized differential privacy,” in 2015 IEEE 31st International Conference on Data Engineering (ICDE), 2015.
- C. Ferrando, J. Gillenwater, and A. Kulesza, “Combining public and private data,” in NeurIPS 2021 Workshop Privacy in Machine Learning, 2021.
- B. Niu, Y. Chen, B. Wang, J. Cao, and F. Li, “Utility-aware exponential mechanism for personalized differential privacy,” in 2020 IEEE Wireless Communications and Networking Conference (WCNC), 2020.
- F. McSherry and K. Talwar, “Mechanism design via differential privacy,” in 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS’07). IEEE, 2007.
- Y. Li, S. Liu, J. Wang, and M. Liu, “A local-clustering-based personalized differential privacy framework for user-based collaborative filtering,” in International Conference on Database Systems for Advanced Applications. Springer, 2017.
- S. Zhang, L. Liu, Z. Chen, and H. Zhong, “Probabilistic matrix factorization with personalized differential privacy,” Knowledge-Based Systems, 2019.
- R. Chen, H. Li, A. K. Qin, S. P. Kasiviswanathan, and H. Jin, “Private spatial data aggregation in the local setting,” in 2016 IEEE 32nd International Conference on Data Engineering (ICDE), 2016.
- S. Torkamani, J. B. Ebrahimi, P. Sadeghi, R. G. D’Oliveira, and M. Médard, “Heterogeneous differential privacy via graphs,” in 2022 IEEE International Symposium on Information Theory (ISIT), 2022.
- K. Chatzikokolakis, M. E. Andrés, N. E. Bordenabe, and C. Palamidessi, “Broadening the scope of differential privacy using metrics,” in International Symposium on Privacy Enhancing Technologies Symposium. Springer, 2013.
- M. Noble, A. Bellet, and A. Dieuleveut, “Differentially private federated learning on heterogeneous data,” in International Conference on Artificial Intelligence and Statistics. PMLR, 2022.
- S. Gupta, A. B. Buduru, and P. Kumaraguru, “Differential privacy: a privacy cloak for preserving utility in heterogeneous datasets,” CSI Transactions on ICT, 2022.
- R. Cummings, V. Feldman, A. McMillan, and K. Talwar, “Mean estimation with user-level privacy under data heterogeneity,” in NeurIPS, 2022. [Online]. Available: https://openreview.net/pdf?id=oYbQDV3mon-
- A. Fallah, A. Makhdoumi, A. Malekian, and A. Ozdaglar, “Optimal and differentially private data acquisition: Central and local mechanisms,” arXiv preprint arXiv:2201.03968, 2022.
- R. Cummings, H. Elzayn, E. Pountourakis, V. Gkatzelis, and J. Ziani, “Optimal data acquisition with privacy-aware agents,” in 2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML). IEEE, 2023, pp. 210–224.
- R. F. Barber and J. C. Duchi, “Privacy and statistical risk: Formalisms and minimax bounds,” arXiv preprint arXiv:1412.4451, 2014.
- J. Duchi, M. Jordan, and M. Wainwright, “Local privacy and minimax bounds: Sharp rates for probability estimation,” Advances in Neural Information Processing Systems, 2013.
- C. Dwork, A. Roth et al., “The algorithmic foundations of differential privacy,” Foundations and Trends® in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211–407, 2014.