Differentially private inference via noisy optimization (2103.11003v4)
Abstract: We propose a general optimization-based framework for computing differentially private M-estimators and a new method for constructing differentially private confidence regions. Firstly, we show that robust statistics can be used in conjunction with noisy gradient descent or noisy Newton methods in order to obtain optimal private estimators with global linear or quadratic convergence, respectively. We establish local and global convergence guarantees, under both local strong convexity and self-concordance, showing that our private estimators converge with high probability to a small neighborhood of the non-private M-estimators. Secondly, we tackle the problem of parametric inference by constructing differentially private estimators of the asymptotic variance of our private M-estimators. This naturally leads to approximate pivotal statistics for constructing confidence regions and conducting hypothesis testing. We demonstrate the effectiveness of a bias correction that leads to enhanced small-sample empirical performance in simulations. We illustrate the benefits of our methods in several numerical examples.
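To make the first contribution concrete, here is a minimal sketch of noisy gradient descent for a one-dimensional Huber location M-estimator. It is an illustration under stated assumptions, not the paper's exact algorithm: the function names, the tuning constant c = 1.345, the step size, the starting point, and the number of iterations are all hypothetical choices. The noise scale comes from the standard Gaussian mechanism under Gaussian differential privacy (GDP), with the privacy budget split evenly across iterations. The key point is that the bounded Huber score gives the averaged gradient a finite sensitivity of 2c/n, which is what lets Gaussian noise be calibrated at all.

```python
import numpy as np


def huber_psi(r, c=1.345):
    """Huber score function: residuals clipped to [-c, c], hence bounded by c."""
    return np.clip(r, -c, c)


def noisy_gd_location(x, mu_gdp=1.0, n_iter=50, step=1.0, c=1.345, seed=0):
    """Noisy gradient descent for a Huber location estimate (illustrative sketch).

    Assumption: replacing one observation changes the averaged gradient by at
    most 2c/n, so Gaussian noise with standard deviation
    2c*sqrt(n_iter)/(mu_gdp*n) per step gives an overall mu_gdp-GDP guarantee
    by composition over n_iter steps.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    theta = 0.0  # hypothetical, data-independent starting value
    noise_sd = 2.0 * c * np.sqrt(n_iter) / (mu_gdp * n)
    for _ in range(n_iter):
        # Gradient of the average Huber loss with respect to theta.
        grad = -np.mean(huber_psi(x - theta, c))
        # Perturb the gradient before taking the descent step.
        theta = theta - step * (grad + noise_sd * rng.standard_normal())
    return theta


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(loc=2.0, scale=1.0, size=2000)
    print(noisy_gd_location(data))
```

Because the added noise shrinks at rate 1/n, the private iterate stays in a small neighborhood of the non-private M-estimator for moderately large samples, which is the behavior the abstract describes.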