Efficient Sparse Least Absolute Deviation Regression with Differential Privacy (2401.01294v1)

Published 2 Jan 2024 in stat.ML, cs.LG, and stat.ME

Abstract: In recent years, privacy-preserving machine learning algorithms have attracted increasing attention because of their important applications in many scientific fields. However, most privacy-preserving algorithms in the literature demand that the learning objective be strongly convex and Lipschitz smooth, and thus cannot cover a wide class of robust loss functions (e.g., the quantile/least absolute loss). In this work, we develop a fast privacy-preserving learning solution for a sparse robust regression problem. Our learning loss consists of a robust least absolute loss and an $\ell_1$ sparse penalty term. To quickly solve the non-smooth loss under a given privacy budget, we develop a Fast Robust And Privacy-Preserving Estimation (FRAPPE) algorithm for least absolute deviation regression. Our algorithm achieves fast estimation by reformulating the sparse LAD problem as a penalized least-squares estimation problem and adopts a three-stage noise injection to guarantee $(\epsilon,\delta)$-differential privacy. We show that our algorithm achieves a better trade-off between privacy and statistical accuracy than state-of-the-art privacy-preserving regression algorithms. Finally, we conduct experiments to verify the efficiency of the proposed FRAPPE algorithm.
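
To make the recipe in the abstract concrete, the sketch below illustrates the general pattern it describes: minimizing the non-smooth sparse LAD objective through a least-squares-style surrogate while injecting calibrated Gaussian noise for $(\epsilon,\delta)$-differential privacy. This is a minimal sketch under stated assumptions, not the paper's FRAPPE algorithm: it substitutes a smoothed LAD surrogate, per-sample gradient clipping, the standard Gaussian mechanism, and naive composition for the paper's penalized least-squares reformulation and three-stage noise injection, and all names and parameters (private_sparse_lad, clip, smooth, and so on) are hypothetical.

import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the l1 penalty: shrinks each coordinate toward zero.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def private_sparse_lad(X, y, lam=0.1, eps=1.0, delta=1e-5,
                       n_iters=50, step=0.1, clip=1.0, smooth=1e-3, seed=0):
    # Hypothetical sketch: noisy proximal gradient descent on
    #   (1/n) * sum_i |y_i - x_i' beta|  +  lam * ||beta||_1,
    # with the absolute loss smoothed so its gradient is well defined.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    eps_iter = eps / n_iters  # naive composition; the paper's budget allocation differs
    # Gaussian-mechanism noise scale for a mean of per-sample gradients clipped
    # to norm <= clip (add/remove neighboring datasets).
    sigma = clip * np.sqrt(2.0 * np.log(1.25 / delta)) / (n * eps_iter)
    for _ in range(n_iters):
        r = y - X @ beta
        w = r / np.sqrt(r ** 2 + smooth ** 2)      # smoothed sign(r_i)
        per_sample = -X * w[:, None]               # gradient of each |r_i| term, shape (n, d)
        norms = np.linalg.norm(per_sample, axis=1, keepdims=True)
        per_sample *= np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        grad = per_sample.mean(axis=0) + rng.normal(0.0, sigma, size=d)
        beta = soft_threshold(beta - step * grad, step * lam)  # l1 proximal step
    return beta

For example, one could call beta_hat = private_sparse_lad(X, y, lam=0.05, eps=2.0) on synthetic data. The sketch splits the privacy budget evenly across iterations purely for readability; a tighter accountant (e.g., advanced composition or subsampling amplification) would reduce the per-iteration noise.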

References (53)
  1. C. Dwork, A. Roth et al., “The algorithmic foundations of differential privacy.” Foundations and Trends in Theoretical Computer Science, vol. 9, no. 3-4, pp. 211–407, 2014.
  2. D. Wang and J. Xu, “On sparse linear regression in the local differential privacy model,” in Proceedings of the 36th International Conference on Machine Learning, vol. 97, 2019, pp. 6628–6637.
  3. T. T. Cai, Y. Wang, and L. Zhang, “The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy,” The The Annals of Statistics, vol. 49, no. 5, pp. 2825–2850, 2021.
  4. D. Wang, A. Smith, and J. Xu, “High dimensional sparse linear regression under local differential privacy: Power and limitations,” in 2018 NIPS workshop in Privacy-Preserving Machine Learning, vol. 235, 2018.
  5. J. Ren, J. Xiong, Z. Yao, R. Ma, and M. Lin, “Dplk-means: A novel differential privacy k-means mechanism,” in 2017 IEEE Second International Conference on Data Science in Cyberspace (DSC), 2017, pp. 133–139.
  6. Y. Mülle, C. Clifton, and K. Böhm, “Privacy-integrated graph clustering through differential privacy.” in EDBT/ICDT Workshops, vol. 157, 2015.
  7. M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 308–318.
  8. T. Ha, T. K. Dang, T. T. Dang, T. A. Truong, and M. T. Nguyen, “Differential privacy in deep learning: An overview,” in 2019 International Conference on Advanced Computing and Applications (ACOMP), 2019, pp. 97–102.
  9. Y. Zhu, X. Yu, M. Chandraker, and Y.-X. Wang, “Private-knn: Practical differential privacy for computer vision,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11 854–11 862.
  10. J. L. Horowitz, “Bootstrap methods for median regression models,” Econometrica, vol. 66, no. 6, pp. 1327–1351, 1998.
  11. B. Jayaraman, L. Wang, D. Evans, and Q. Gu, “Distributed learning without distress: Privacy-preserving empirical risk minimization,” in Advances in Neural Information Processing Systems, vol. 31, 2018.
  12. J. Zhang, K. Zheng, W. Mou, and L. Wang, “Efficient private erm for smooth objectives,” in Proceedings of the 26th International Joint Conference on Artificial Intelligence, 2017, pp. 3922–3928.
  13. L. Wang, B. Jayaraman, D. Evans, and Q. Gu, “Efficient privacy-preserving stochastic nonconvex optimization,” in Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, vol. 216, 2023, pp. 2203–2213.
  14. X. Zhang, M. Fang, J. Liu, and Z. Zhu, “Private and communication-efficient edge learning: a sparse differential gaussian-masking distributed sgd approach,” in Proceedings of the Twenty-First International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing, 2020, pp. 261–270.
  15. H. Wang and C. Li, “Distributed quantile regression over sensor networks,” IEEE Transactions on Signal and Information Processing over Networks, vol. 4, no. 2, pp. 338–348, 2017.
  16. H. Wang, L. Xia, and C. Li, “Distributed online quantile regression over networks with quantized communication,” Signal Processing, vol. 157, pp. 141–150, 2019.
  17. Y. Tian, M. Tian, and Q. Zhu, “Linear quantile regression based on em algorithm,” Communications in Statistics-Theory and Methods, vol. 43, no. 16, pp. 3464–3484, 2014.
  18. Y. Tian, Q. Zhu, and M. Tian, “Estimation of linear composite quantile regression using em algorithm,” Statistics & Probability Letters, vol. 117, pp. 183–191, 2016.
  19. Y.-h. Zhou, Z.-x. Ni, and Y. Li, “Quantile regression via the em algorithm,” Communications in Statistics-Simulation and Computation, vol. 43, no. 10, pp. 2162–2172, 2014.
  20. L. Yu and N. Lin, “Admm for penalized quantile regression in big data,” International Statistical Review, vol. 85, no. 3, pp. 494–518, 2017.
  21. Y. Gu, J. Fan, L. Kong, S. Ma, and H. Zou, “Admm for high-dimensional sparse penalized quantile regression,” Technometrics, vol. 60, no. 3, pp. 319–331, 2018.
  22. X. He, X. Pan, K. M. Tan, and W.-X. Zhou, “Smoothed quantile regression with large-scale inference,” Journal of Econometrics, vol. 232, no. 2, pp. 367–388, 2023.
  23. P. J. Huber, “Robust statistics,” in International Encyclopedia of Statistical Science.   Springer, 2011, pp. 1248–1251.
  24. C. Yu and W. Yao, “Robust linear regression: A review and comparison,” Communications in Statistics-Simulation and Computation, vol. 46, no. 8, pp. 6261–6282, 2017.
  25. A. F. Siegel, “Robust regression using repeated medians,” Biometrika, vol. 69, no. 1, pp. 242–244, 1982.
  26. P. J. Rousseeuw and C. Croux, “Alternatives to the median absolute deviation,” Journal of the American Statistical Association, vol. 88, no. 424, pp. 1273–1283, 1993.
  27. P. Rousseeuw and V. Yohai, “Robust regression by means of s-estimators,” in Robust and Nonlinear Time Series Analysis.   Springer, 1984, pp. 256–272.
  28. Y. Li and J. Zhu, “ℓ1subscriptℓ1\ell_{1}roman_ℓ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT-norm quantile regression,” Journal of Computational and Graphical Statistics, vol. 17, no. 1, pp. 163–185, 2008.
  29. A. Belloni and V. Chernozhukov, “ℓ1subscriptℓ1\ell_{1}roman_ℓ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT-penalized quantile regression in high-dimensional sparse models,” The Annals of Statistics, vol. 39, no. 1, pp. 82–130, 2011.
  30. L. Wang, Y. Wu, and R. Li, “Quantile regression for analyzing heterogeneity in ultra-high dimension,” Journal of the American Statistical Association, vol. 107, no. 497, pp. 214–222, 2012.
  31. Q. Zheng, L. Peng, and X. He, “High dimensional censored quantile regression,” The Annals of Statistics, vol. 46, no. 1, pp. 308–343, 2018.
  32. L. Wang, “The ℓ1subscriptℓ1\ell_{1}roman_ℓ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT penalized lad estimator for high dimensional linear regression,” Journal of Multivariate Analysis, vol. 120, pp. 135–151, 2013.
  33. Y. Wu and Y. Liu, “Variable selection in quantile regression,” Statistica Sinica, vol. 19, no. 2, pp. 801–817, 2009.
  34. J. Fan, Y. Fan, and E. Barut, “Adaptive robust variable selection,” The Annals of Statistics, vol. 42, no. 1, pp. 324–351, 2014.
  35. J. Fan, L. Xue, and H. Zou, “Strong oracle optimality of folded concave penalized estimation,” The Annals of Statistics, vol. 42, no. 3, pp. 819–849, 2014.
  36. R. Koenker and P. Ng, “A frisch-newton algorithm for sparse quantile regression,” Acta Mathematicae Applicatae Sinica, vol. 21, no. 2, pp. 225–236, 2005.
  37. T. T. Wu and K. Lange, “Coordinate descent algorithms for lasso penalized regression,” The Annals of Applied Statistics, vol. 2, no. 1, pp. 224–244, 2008.
  38. B. Peng and L. Wang, “An iterative coordinate descent algorithm for high-dimensional nonconvex penalized quantile regression,” Journal of Computational and Graphical Statistics, vol. 24, no. 3, pp. 676–694, 2015.
  39. D. R. Hunter and K. Lange, “Quantile regression via an mm algorithm,” Journal of Computational and Graphical Statistics, vol. 9, no. 1, pp. 60–77, 2000.
  40. J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Minimax optimal procedures for locally private estimation,” Journal of the American Statistical Association, vol. 113, no. 521, pp. 182–201, 2018.
  41. Q. Zheng, S. Chen, Q. Long, and W. Su, “Federated f-differential privacy,” in Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, vol. 130.   PMLR, 2021, pp. 2251–2259.
  42. K. Chaudhuri, C. Monteleoni, and A. D. Sarwate, “Differentially private empirical risk minimization.” Journal of Machine Learning Research, vol. 12, no. 3, pp. 1069–1109, 2011.
  43. L. Wang and Q. Gu, “Differentially private iterative gradient hard thresholding for sparse learning,” in Proceedings of the 28th International Joint Conference on Artificial Intelligence, 2019, pp. 3740–3747.
  44. D. Wang and J. Xu, “Differentially private ℓ1subscriptℓ1\ell_{1}roman_ℓ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT-norm linear regression with heavy-tailed data,” in 2022 IEEE International Symposium on Information Theory (ISIT), 2022, pp. 1856–1861.
  45. L. Hu, S. Ni, H. Xiao, and D. Wang, “High dimensional differentially private stochastic optimization with heavy-tailed data,” in Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 2022, pp. 227–236.
  46. B. Balle, G. Barthe, and M. Gaboardi, “Privacy amplification by subsampling: Tight analyses via couplings and divergences,” Advances in Neural Information Processing Systems, vol. 31, 2018.
  47. C. Yi and J. Huang, “Semismooth newton coordinate descent algorithm for elastic-net penalized huber loss regression and quantile regression,” Journal of Computational and Graphical Statistics, vol. 26, no. 3, pp. 547–557, 2017.
  48. M. Su and W. Wang, “Elastic net penalized quantile regression model,” Journal of Computational and Applied Mathematics, vol. 392, p. 113462, 2021.
  49. S. Shalev-Shwartz, O. Shamir, N. Srebro, and K. Sridharan, “Stochastic convex optimization,” in 22nd Annual Conference on Learning Theory (COLT), vol. 2, no. 4, 2009, pp. 5–15.
  50. S. Neel, A. Roth, and S. Sharifi-Malvajerdi, “Descent-to-delete: Gradient-based methods for machine unlearning,” in Proceedings of the 32nd International Conference on Algorithmic Learning Theory, vol. 132, 2021, pp. 931–962.
  51. M. Bun and T. Steinke, “Concentrated differential privacy: Simplifications, extensions, and lower bounds,” in Theory of Cryptography Conference, vol. 9985.   Springer, 2016, pp. 635–658.
  52. C. Dwork, W. Su, and L. Zhang, “Differentially private false discovery rate control,” Journal of Privacy and Confidentiality, vol. 11, no. 2, 2021.
  53. M. Redmond, “Communities and Crime Unnormalized,” UCI Machine Learning Repository, 2011, DOI: https://doi.org/10.24432/C5PC8X.