Differentially Private Worst-group Risk Minimization

Published 29 Feb 2024 in cs.LG, cs.AI, and cs.CR | arXiv:2402.19437v1

Abstract: We initiate a systematic study of worst-group risk minimization under $(\epsilon, \delta)$-differential privacy (DP). The goal is to privately find a model that approximately minimizes the maximal risk across $p$ sub-populations (groups) with different distributions, where each group distribution is accessed via a sample oracle. We first present a new algorithm that achieves excess worst-group population risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon} + \sqrt{\frac{p}{K}})$, where $K$ is the total number of samples drawn from all groups and $d$ is the problem dimension. Our rate is nearly optimal when each distribution is observed via a fixed-size dataset of size $K/p$. Our result is based on a new stability-based analysis for the generalization error. In particular, we show that $\Delta$-uniform argument stability implies $\tilde{O}(\Delta + \frac{1}{\sqrt{n}})$ generalization error w.r.t. the worst-group risk, where $n$ is the number of samples drawn from each sample oracle. Next, we propose an algorithmic framework for worst-group population risk minimization that uses any DP online convex optimization algorithm as a subroutine. This yields another excess risk bound of $\tilde{O}\left( \sqrt{\frac{d^{1/2}}{\epsilon K}} + \sqrt{\frac{p}{K\epsilon^2}} \right)$. Assuming the typical setting of $\epsilon=\Theta(1)$, this bound is more favorable than our first bound in a certain range of $p$ as a function of $K$ and $d$. Finally, we study differentially private worst-group empirical risk minimization in the offline setting, where each group distribution is observed via a fixed-size dataset. We present a new algorithm with nearly optimal excess risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon})$.
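
To make the objective concrete: the paper targets $\min_w \max_{\lambda \in \Delta_p} \sum_{g=1}^p \lambda_g R_g(w)$, where $R_g$ is the population risk of group $g$ and $\Delta_p$ is the probability simplex. Below is a minimal, hypothetical sketch of a noisy stochastic gradient descent-ascent loop for this minimax problem. It is not the paper's algorithm: the linear model, squared loss, clipping threshold, step sizes, and noise scale `sigma` are all illustrative assumptions, and `sigma` is an uncalibrated placeholder rather than a noise level that certifies any particular $(\epsilon, \delta)$ guarantee.

```python
import numpy as np

def dp_worst_group_sgda(sample_oracles, d, T=1000, eta_w=0.1, eta_lam=0.05,
                        clip=1.0, sigma=1.0, rng=None):
    """Hypothetical noisy SGDA sketch for min_w max_{lam in simplex}
    sum_g lam_g * R_g(w). Not the paper's algorithm; sigma is an
    uncalibrated placeholder for a Gaussian-mechanism noise scale.

    sample_oracles: list of p callables; oracle g returns one (x, y)
    sample drawn from group g's distribution.
    """
    rng = rng or np.random.default_rng(0)
    p = len(sample_oracles)
    w = np.zeros(d)              # linear-model parameters
    lam = np.full(p, 1.0 / p)    # group weights on the probability simplex

    def loss_and_grad(w, x, y):
        # squared loss of a linear predictor; any convex loss would do
        r = float(w @ x) - y
        return 0.5 * r * r, r * x

    for _ in range(T):
        losses = np.empty(p)
        grad_w = np.zeros(d)
        for g, oracle in enumerate(sample_oracles):
            x, y = oracle()  # one fresh sample from group g's oracle
            losses[g], gw = loss_and_grad(w, x, y)
            # clip the per-group gradient to bound its sensitivity
            gw *= min(1.0, clip / (np.linalg.norm(gw) + 1e-12))
            grad_w += lam[g] * gw
        # Gaussian noise on the primal step (placeholder DP mechanism)
        grad_w += rng.normal(0.0, sigma * clip, size=d)
        w -= eta_w * grad_w
        # exponentiated-gradient ascent on lam: upweight worse-off groups
        lam *= np.exp(eta_lam * losses)
        lam /= lam.sum()
    return w, lam
```

The dual update is exponentiated-gradient ascent on the simplex, which shifts weight toward the currently worst-off group; the paper's algorithms achieve their stated rates through a more careful privacy and stability analysis than this sketch attempts.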
