On the Robustness of LDP Protocols for Numerical Attributes under Data Poisoning Attacks (2403.19510v3)
Abstract: Recent studies reveal that local differential privacy (LDP) protocols are vulnerable to data poisoning attacks, in which an attacker manipulates the final estimate on the server by exploiting the characteristics of LDP and sending carefully crafted data from a small fraction of controlled local clients. This vulnerability raises concerns about the robustness and reliability of LDP in hostile environments. In this paper, we conduct a systematic investigation of the robustness of state-of-the-art LDP protocols for numerical attributes, i.e., categorical frequency oracles (CFOs) with binning and consistency, and distribution reconstruction. We evaluate protocol robustness through an attack-driven approach and propose new metrics for cross-protocol attack gain measurement. The results indicate that Square Wave and CFO-based protocols in the Server setting are more robust against the attack than CFO-based protocols in the User setting. Our evaluation also reveals new relationships between LDP security and its inherent design choices. We find that the hash domain size in local-hashing-based LDP has a profound impact on protocol robustness, beyond its well-known effect on utility. Further, we propose a zero-shot attack detection method that leverages the rich reconstructed distribution information. The experiments show that our detection significantly improves on existing methods and effectively identifies data manipulation in challenging scenarios.
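To make the threat model in the abstract concrete, the following minimal sketch shows how a few fake users can shift a frequency estimate under generalized randomized response (GRR), one simple categorical frequency oracle. It is an illustration only and not the attack or the protocols evaluated in the paper: the function names, the choice of GRR, the uniform honest data, and all parameter values are assumptions made for this example.

```python
import math
import random


def grr_perturb(value, k, epsilon, rng):
    """Report the true value with probability p, else a uniformly chosen other value."""
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    other = rng.randrange(k - 1)
    # Skip over the true value so that "other" ranges over the k - 1 remaining items.
    return other if other < value else other + 1


def grr_estimate(reports, k, epsilon):
    """Unbiased GRR frequency estimate computed by the server from the raw reports."""
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = 1.0 / (math.exp(epsilon) + k - 1)
    counts = [0] * k
    for r in reports:
        counts[r] += 1
    return [(c / n - q) / (p - q) for c in counts]


def simulate(n_honest, n_fake, target, k=8, epsilon=1.0, seed=0):
    """Return (clean, poisoned) estimates; hypothetical setup with uniform honest data."""
    rng = random.Random(seed)
    honest = [grr_perturb(rng.randrange(k), k, epsilon, rng) for _ in range(n_honest)]
    # Assumption: fake users bypass the perturbation step and always report the target.
    fake = [target] * n_fake
    return grr_estimate(honest, k, epsilon), grr_estimate(honest + fake, k, epsilon)


if __name__ == "__main__":
    clean, poisoned = simulate(n_honest=10_000, n_fake=500, target=3)
    print("clean estimate for target:   ", round(clean[3], 4))
    print("poisoned estimate for target:", round(poisoned[3], 4))
```

Because the server divides by p - q when debiasing, each unperturbed fake report moves the estimate by more than one honest report would. This amplification is why a small fraction of controlled clients can have an outsized effect.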