On the Computational Complexity of Private High-dimensional Model Selection (2310.07852v5)
Abstract: We consider the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints. We propose a differentially private (DP) best subset selection method with strong statistical utility properties by adopting the well-known exponential mechanism for selecting the best model. To achieve computational expediency, we propose an efficient Metropolis-Hastings algorithm and, under certain regularity conditions, establish that it enjoys polynomial mixing time to its stationary distribution. As a result, we also establish both approximate differential privacy and statistical utility for the estimates of the mixed Metropolis-Hastings chain. Finally, we perform illustrative experiments on simulated data showing that our algorithm can quickly identify active features under reasonable privacy budget constraints.
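To make the approach concrete, below is a minimal Python sketch of a Metropolis-Hastings chain over fixed-size feature subsets whose stationary distribution is an exponential-mechanism distribution with a residual-sum-of-squares utility. This is an illustrative assumption of how such a sampler could look, not the paper's exact construction: the function names (`rss`, `mh_subset_selection`), the single-swap proposal, and the `sensitivity` parameter used to calibrate the utility score are all placeholders, and the paper's utility function, proposal kernel, and sensitivity calibration may differ.

```python
import numpy as np

def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit restricted to `subset`."""
    if len(subset) == 0:
        return float(y @ y)
    Xs = X[:, subset]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return float(resid @ resid)

def mh_subset_selection(X, y, s, epsilon, sensitivity, n_iter=5000, rng=None):
    """Metropolis-Hastings over size-s subsets targeting the exponential-mechanism
    distribution p(gamma) proportional to exp(-epsilon * rss(gamma) / (2 * sensitivity))."""
    rng = np.random.default_rng() if rng is None else rng
    p = X.shape[1]
    current = list(rng.choice(p, size=s, replace=False))
    current_rss = rss(X, y, current)
    for _ in range(n_iter):
        # Propose a swap: replace one active feature with one inactive feature.
        proposal = current.copy()
        out_idx = rng.integers(s)
        inactive = [j for j in range(p) if j not in current]
        proposal[out_idx] = int(rng.choice(inactive))
        prop_rss = rss(X, y, proposal)
        # The swap proposal is symmetric, so the acceptance ratio reduces to the
        # ratio of target densities, i.e. it depends only on the utility difference.
        log_accept = epsilon * (current_rss - prop_rss) / (2.0 * sensitivity)
        if np.log(rng.uniform()) < log_accept:
            current, current_rss = proposal, prop_rss
    return sorted(current)
```

Two points of the sketch mirror the abstract's reasoning: because the proposal is symmetric, each accept/reject step only compares the utilities of the current and proposed models, and the privacy guarantee of the returned subset rests on the chain having approximately mixed to the exponential-mechanism target, which is why the polynomial mixing-time result matters for the overall (approximate) DP and utility claims.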