AIJack: Let's Hijack AI! Security and Privacy Risk Simulator for Machine Learning (2312.17667v2)
Abstract: This paper introduces AIJack, an open-source library for assessing the security and privacy risks that arise when training and deploying machine learning models. Amid growing interest in big data and AI, machine learning research and its industrial adoption are accelerating. However, recent studies reveal threats such as the theft of training data and the manipulation of models by malicious attackers. A comprehensive understanding of these vulnerabilities is therefore crucial for safely integrating machine learning into real-world products. AIJack addresses this need by implementing a wide range of attack and defense methods behind a unified API. The library is publicly available on GitHub (https://github.com/Koukyosyumei/AIJack).
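To make the kind of risk the abstract mentions concrete, the sketch below illustrates one well-known attack class that a simulator like AIJack covers: gradient leakage ("deep leakage from gradients", Zhu et al., 2019), in which an attacker who observes only a model's gradients (e.g., a client update in federated learning) reconstructs the private training input. This is a minimal, self-contained PyTorch illustration, not AIJack's actual API; the toy model, dimensions, and the helper `soft_cross_entropy` are assumptions made for the example.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Toy victim model and one private training example (what the attacker wants).
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
x_true = torch.randn(1, 8)
y_true = torch.tensor([1])

# The attacker observes only the gradients computed on the private example,
# e.g., a gradient update shared by a client during federated training.
loss_true = F.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss_true, model.parameters())

def soft_cross_entropy(pred, soft_target):
    # Cross-entropy against a soft (probability) label, so the dummy label
    # can be optimized jointly with the dummy input, as in the DLG paper.
    return torch.mean(torch.sum(-soft_target * F.log_softmax(pred, dim=-1), dim=-1))

# The attack: optimize a dummy input/label pair so that its gradients match
# the observed ones; at convergence, x_dummy approximates the private x_true.
x_dummy = torch.randn(1, 8, requires_grad=True)
y_dummy = torch.randn(1, 2, requires_grad=True)
optimizer = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    optimizer.zero_grad()
    dummy_loss = soft_cross_entropy(model(x_dummy), y_dummy.softmax(dim=-1))
    # create_graph=True keeps the gradient computation differentiable,
    # so we can backpropagate through it into x_dummy and y_dummy.
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    grad_diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    return grad_diff

for _ in range(30):
    optimizer.step(closure)

print("reconstruction error:", (x_dummy.detach() - x_true).norm().item())
```

Defenses the library pairs with such attacks, like differentially private training or gradient sparsification, work precisely by perturbing or truncating the shared gradients so that this matching objective no longer identifies the private input.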