KDk: A Defense Mechanism Against Label Inference Attacks in Vertical Federated Learning (2404.12369v1)
Abstract: Vertical Federated Learning (VFL) is a category of Federated Learning in which models are trained collaboratively among parties with vertically partitioned data. Typically, in a VFL scenario, the labels of the samples are kept private from all the parties except for the aggregating server, which is the label owner. Nevertheless, recent works have shown that an adversary with knowledge of only a small set of auxiliary labels, covering a very limited subset of the training data points, can exploit the gradient information returned by the server to the bottom models to infer the private labels. These attacks are known as label inference attacks in VFL. In our work, we propose a novel framework called KDk, which combines Knowledge Distillation and k-anonymity to provide a defense mechanism against label inference attacks in a VFL scenario. Through an exhaustive experimental campaign, we demonstrate that by applying our approach the performance of the analyzed label inference attacks decreases substantially, by more than 60% in some cases, while the accuracy of the whole VFL remains almost unaltered.
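The abstract describes a defense that combines two ingredients: knowledge distillation (the label owner distributes softened probability vectors rather than hard labels) and k-anonymity (the true label is made indistinguishable among k candidate classes). The sketch below is a minimal, hypothetical illustration of how such a pipeline could be composed; the temperature value, the choice of k, and the uniform redistribution over the top-k classes are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def soften_labels(logits, temperature=4.0):
    # Knowledge-distillation step: temperature-scaled softmax produces
    # "soft" labels instead of one-hot targets (temperature is illustrative).
    z = logits / temperature
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=-1, keepdims=True)

def k_anonymize(soft_labels, k=3):
    # k-anonymity step (sketch): keep only the k most likely classes and
    # spread their probability mass uniformly, so an adversary observing the
    # returned target cannot single out the true label among k candidates.
    out = np.zeros_like(soft_labels)
    topk = np.argsort(soft_labels, axis=-1)[:, -k:]
    for i, idx in enumerate(topk):
        out[i, idx] = 1.0 / k
    return out
```

In this sketch the server would train the top model against the k-anonymized soft targets, so the gradients sent back to the bottom models carry the same information for each of the k candidate classes.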