Differentially Private Worst-group Risk Minimization
Abstract: We initiate a systematic study of worst-group risk minimization under $(\epsilon, \delta)$-differential privacy (DP). The goal is to privately find a model that approximately minimizes the maximal risk across $p$ sub-populations (groups) with different distributions, where each group distribution is accessed via a sample oracle. We first present a new algorithm that achieves excess worst-group population risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon} + \sqrt{\frac{p}{K}})$, where $K$ is the total number of samples drawn from all groups and $d$ is the problem dimension. Our rate is nearly optimal when each distribution is observed via a fixed-size dataset of size $K/p$. Our result is based on a new stability-based analysis for the generalization error. In particular, we show that $\Delta$-uniform argument stability implies $\tilde{O}(\Delta + \frac{1}{\sqrt{n}})$ generalization error w.r.t. the worst-group risk, where $n$ is the number of samples drawn from each sample oracle. Next, we propose an algorithmic framework for worst-group population risk minimization using any DP online convex optimization algorithm as a subroutine. Via this framework, we give another excess risk bound of $\tilde{O}\left( \sqrt{\frac{d^{1/2}}{\epsilon K}} + \sqrt{\frac{p}{K\epsilon^2}} \right)$. Assuming the typical setting of $\epsilon=\Theta(1)$, this bound is more favorable than our first bound in a certain range of $p$ as a function of $K$ and $d$. Finally, we study differentially private worst-group empirical risk minimization in the offline setting, where each group distribution is observed by a fixed-size dataset. We present a new algorithm with nearly optimal excess risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon})$.
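To make the objective concrete, below is a minimal illustrative sketch (not the paper's algorithm) of worst-group empirical risk minimization: noisy subgradient descent on $\max_{i \in [p]} \hat{L}_i(w)$, where gradients are clipped and perturbed with Gaussian noise in the style of the Gaussian mechanism. The model class (linear), loss (squared loss), and noise multiplier `sigma` are assumptions for illustration; a full DP treatment would also privatize the worst-group selection and carry out proper privacy accounting.

```python
# Illustrative sketch only: noisy subgradient descent on the worst-group
# empirical risk  max_i L_i(w)  over p groups. This is NOT the paper's
# algorithm; the loss, model class, and noise scale are assumed for the example.
import numpy as np

def group_loss_and_grad(w, X, y):
    """Squared loss and its gradient for one group's dataset (X, y)."""
    residual = X @ w - y
    loss = 0.5 * np.mean(residual ** 2)
    grad = X.T @ residual / len(y)
    return loss, grad

def dp_worst_group_erm(groups, d, steps=500, lr=0.05, clip=1.0, sigma=1.0, seed=0):
    """Noisy subgradient method for min_w max_i L_i(w).

    groups : list of (X_i, y_i) pairs, one per group.
    sigma  : Gaussian noise multiplier; the value needed for a target
             (epsilon, delta) depends on privacy accounting, omitted here.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(d)
    for _ in range(steps):
        # A subgradient of max_i L_i(w) is the gradient of any worst (active) group.
        losses_grads = [group_loss_and_grad(w, X, y) for X, y in groups]
        worst = int(np.argmax([lg[0] for lg in losses_grads]))
        g = losses_grads[worst][1]
        # NOTE: selecting the worst group from non-private losses leaks information;
        # a complete DP algorithm would use a private selection step instead.
        g = g / max(1.0, np.linalg.norm(g) / clip)      # clip to bound sensitivity
        g = g + rng.normal(scale=sigma * clip, size=d)  # Gaussian noise
        w = w - lr * g
    return w
```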