User-Level Differential Privacy With Few Examples Per User (2309.12500v1)
Abstract: Previous work on user-level differential privacy (DP) [Ghazi et al. NeurIPS 2021, Bun et al. STOC 2023] obtained generic algorithms that work for various learning tasks. However, their focus was on the example-rich regime, where the users have so many examples that each user alone could solve the problem. In this work, we consider the example-scarce regime, where each user has only a few examples, and obtain the following results: 1. For approximate-DP, we give a generic transformation of any item-level DP algorithm to a user-level DP algorithm. Roughly speaking, the latter gives a (multiplicative) savings of $O_{\varepsilon,\delta}(\sqrt{m})$ in the number of users required to achieve the same utility, where $m$ is the number of examples per user. This algorithm, while recovering most known bounds for specific problems, also gives new bounds, e.g., for PAC learning. 2. For pure-DP, we present a simple technique for adapting the exponential mechanism [McSherry, Talwar FOCS 2007] to the user-level setting. This gives new bounds for a variety of tasks, such as private PAC learning, hypothesis selection, and distribution learning. For some of these problems, we show that our bounds are near-optimal.
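The paper's pure-DP result builds on the exponential mechanism of McSherry and Talwar. As background, here is a minimal sketch of the standard (item-level) exponential mechanism that the paper adapts; the function name and interface are illustrative, not from the paper. Given a finite candidate set and a score function with bounded sensitivity, a candidate is sampled with probability proportional to $\exp(\varepsilon \cdot \mathrm{score} / (2\Delta))$.

```python
import math
import random

def exponential_mechanism(candidates, score, sensitivity, epsilon, rng=random):
    """Sample one candidate with probability proportional to
    exp(epsilon * score(c) / (2 * sensitivity)).

    `score` maps a candidate to a real-valued utility whose sensitivity
    (max change when one database item changes) is at most `sensitivity`.
    """
    scores = [score(c) for c in candidates]
    m = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(epsilon * (s - m) / (2 * sensitivity)) for s in scores]
    total = sum(weights)
    # Inverse-CDF sampling over the unnormalized weights.
    r = rng.random() * total
    acc = 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if r < acc:
            return c
    return candidates[-1]  # guard against floating-point rounding
```

In the user-level setting the paper considers, the key change is how the score aggregates each user's $m$ examples so that its user-level sensitivity stays small; the sampling step above is unchanged.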
- John M Abowd. The US Census Bureau adopts differential privacy. In KDD, page 2867, 2018.
- Discrete distribution estimation under user-level local differential privacy. In AISTATS, 2023.
- Apple Differential Privacy Team. Learning with privacy at scale. Apple Machine Learning Journal, 2017.
- Differentially private Assouad, Fano, and Le Cam. In ALT, pages 48–78, 2021.
- Privacy amplification by subsampling: Tight analyses via couplings and divergences. In NeurIPS, pages 6280–6290, 2018.
- Typicality-based stability and privacy. CoRR, abs/1604.03336, 2016.
- Stability is stable: Connections between replicability, privacy, and adaptive generalization. In STOC, 2023.
- Private hypothesis selection. In NeurIPS, pages 156–167, 2019.
- Characterizing the sample complexity of pure private learners. JMLR, 20:146:1–146:33, 2019.
- Adaptive learning with robust generalization guarantees. In COLT, pages 772–814, 2016.
- Differentially private learning of structured discrete distributions. In NIPS, pages 2566–2574, 2015.
- Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, pages 486–503, 2006.
- Collecting telemetry data privately. In NeurIPS, pages 3571–3580, 2017.
- Differential privacy and robust statistics. In STOC, pages 371–380, 2009.
- Calibrating noise to sensitivity in private data analysis. In TCC, pages 265–284, 2006.
- The minimax learning rates of normal and Ising undirected graphical models. Electronic Journal of Statistics, 14(1):2338 – 2361, 2020.
- Sample-efficient proper PAC learning with approximate differential privacy. In STOC, 2021.
- On user-level private convex optimization. In ICML, 2023.
- User-level differentially private learning via correlated sampling. In NeurIPS, pages 20172–20184, 2021.
- Andy Greenberg. Apple's "differential privacy" is about collecting your data – but not your data. Wired, June 13, 2016.
- Reproducibility in learning. In STOC, pages 818–831, 2022.
- Differential privacy for black-box statistical analyses. In TPDP, 2021.
- Privately learning high-dimensional distributions. In COLT, pages 1853–1902, 2019.
- Samuel Kutin. Extensions to McDiarmid’s inequality when differences are bounded with high probability. Dept. Comput. Sci., Univ. Chicago, Chicago, IL, USA, Tech. Rep. TR-2002-04, 2002.
- Learning with user-level privacy. In NeurIPS, pages 12466–12479, 2021.
- Learning discrete distributions: user vs item-level privacy. In NeurIPS, 2020.
- Mechanism design via differential privacy. In FOCS, pages 94–103, 2007.
- Tight and robust private mean estimation with few users. In ICML, pages 16383–16412, 2022.
- Introducing TensorFlow Privacy: Learning with Differential Privacy for Training Data, March 2019. blog.tensorflow.org.
- Michel Talagrand. Concentration of measure and isoperimetric inequalities in product spaces. Publications Mathématiques de l’Institut des Hautes Etudes Scientifiques, 81:73–205, 1995.
- PyTorch Differential Privacy Series Part 1: DP-SGD Algorithm Explained, August 2020. medium.com.
- Salil P. Vadhan. The complexity of differential privacy. In Tutorials on the Foundations of Cryptography, pages 347–450. Springer International Publishing, 2017.
- Lutz Warnke. On the method of typical bounded differences. Comb. Probab. Comput., 25(2):269–299, 2016.
- Badih Ghazi
- Pritish Kamath
- Ravi Kumar
- Pasin Manurangsi
- Raghu Meka
- Chiyuan Zhang