
Fast Optimal Locally Private Mean Estimation via Random Projections (2306.04444v2)

Published 7 Jun 2023 in cs.LG, cs.CR, and stat.ML

Abstract: We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a $1+o(1)$-factor. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm such as PrivUnitG in the lower-dimensional space. In addition, we show that, by appropriately correlating the random projection matrices across devices, we can achieve fast server run-time. We mathematically analyze the error of the algorithm in terms of properties of the random projections, and study two instantiations. Lastly, our experiments for private mean estimation and private federated learning demonstrate that our algorithms empirically obtain nearly the same utility as optimal ones while having significantly lower communication and computational cost.
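The ProjUnit pipeline described in the abstract can be sketched in a few lines. The sketch below is an illustration of the data flow only, under assumptions not taken from the paper: the per-device projection is drawn as a matrix with orthonormal rows via QR, and the low-dimensional privatizer is left as a pluggable function (the paper uses PrivUnitG, which is not implemented here; an identity placeholder stands in for it, so this sketch provides no privacy as written). All function names are hypothetical.

```python
import numpy as np

def random_projection(d, k, rng):
    # Sample a k x d matrix with orthonormal rows (assumed form of the
    # random subspace projection) via QR of a Gaussian matrix.
    G = rng.standard_normal((d, k))
    Q, _ = np.linalg.qr(G)   # Q: d x k, orthonormal columns
    return Q.T               # k x d, orthonormal rows

def proj_unit_randomizer(x, W, privatize):
    # Project the unit vector x to the k-dim subspace, renormalize,
    # then apply a lower-dimensional private randomizer.
    y = W @ x
    y = y / np.linalg.norm(y)
    return privatize(y)

def server_estimate(messages, projections):
    # Lift each privatized low-dim message back to R^d and average.
    lifted = [W.T @ z for z, W in zip(messages, projections)]
    return np.mean(lifted, axis=0)

rng = np.random.default_rng(0)
d, k, n = 1000, 64, 200
xs = rng.standard_normal((n, d))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)   # inputs on the unit sphere
Ws = [random_projection(d, k, rng) for _ in range(n)]
# Identity "privatizer" as a placeholder; a real deployment would use
# an optimal low-dimensional randomizer such as PrivUnitG here.
msgs = [proj_unit_randomizer(x, W, lambda y: y) for x, W in zip(xs, Ws)]
est = server_estimate(msgs, Ws)
print(est.shape)
```

Note that lifting and averaging each message independently costs O(nkd) server time; the paper's correlated-projection variant is what reduces this further, and is not shown here.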

