Efficient Sparse Least Absolute Deviation Regression with Differential Privacy (2401.01294v1)
Abstract: In recent years, privacy-preserving machine learning algorithms have attracted increasing attention because of their important applications in many scientific fields. However, most privacy-preserving algorithms in the literature require the learning objective to be strongly convex and Lipschitz smooth, and therefore cannot cover a wide class of robust loss functions (e.g., the quantile/least absolute loss). In this work, we aim to develop a fast privacy-preserving learning solution for a sparse robust regression problem. Our learning loss consists of a robust least absolute loss and an $\ell_1$ sparse penalty term. To solve the non-smooth loss quickly under a given privacy budget, we develop a Fast Robust And Privacy-Preserving Estimation (FRAPPE) algorithm for least absolute deviation regression. Our algorithm achieves fast estimation by reformulating the sparse LAD problem as a penalized least squares estimation problem, and it adopts a three-stage noise injection to guarantee $(\epsilon,\delta)$-differential privacy. We show that our algorithm achieves a better trade-off between privacy and statistical accuracy than state-of-the-art privacy-preserving regression algorithms. Finally, we conduct experiments to verify the efficiency of the proposed FRAPPE algorithm.
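To make the reformulation idea concrete, here is a minimal illustrative sketch of the kind of scheme the abstract describes: the absolute loss $|r|$ is majorized by a weighted quadratic surrogate (an MM/IRLS step), so each round reduces to a penalized least squares problem; the $\ell_1$ penalty is handled by soft-thresholding; and Gaussian noise is added to each gradient step as a stand-in for the paper's noise injection. The function `frappe_sketch`, the single-stage noise placement, and the parameter `sigma` are assumptions made for illustration, not the authors' actual three-stage FRAPPE procedure; in particular, `sigma` would have to be calibrated to the $(\epsilon,\delta)$ budget, which this sketch does not do.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (coordinate-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def frappe_sketch(X, y, lam, sigma, step=0.1, n_iters=200, floor=1e-6, seed=0):
    """Hypothetical sketch of sparse LAD via a penalized least-squares
    reformulation with Gaussian noise injection; NOT the authors' FRAPPE.

    Objective: (1/n) * sum_i |y_i - x_i' beta| + lam * ||beta||_1.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iters):
        r = y - X @ beta
        # MM surrogate: |r| <= r^2 / (2|r_old|) + |r_old| / 2, so the non-smooth
        # LAD loss is replaced by a weighted least-squares problem each round.
        w = 1.0 / (2.0 * np.maximum(np.abs(r), floor))
        grad = -(2.0 / n) * (X.T @ (w * r))   # gradient of the surrogate at beta
        # Gaussian perturbation standing in for the paper's noise injection;
        # sigma must be calibrated to the (epsilon, delta) budget (omitted here).
        noisy_grad = grad + rng.normal(0.0, sigma, size=p)
        # Proximal (soft-thresholding) step for the l1 penalty.
        beta = soft_threshold(beta - step * noisy_grad, step * lam)
    return beta
```

A call such as `frappe_sketch(X, y, lam=0.05, sigma=0.01)` runs the sketch on a design matrix `X` and response `y`. The point of the surrogate is that each iteration optimizes a smooth quadratic rather than the non-smooth LAD loss, which is what makes standard gradient-perturbation privacy analyses applicable.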