FedD2S: Personalized Data-Free Federated Knowledge Distillation (2402.10846v1)
Abstract: This paper addresses the challenge of mitigating data heterogeneity among clients within a Federated Learning (FL) framework. The model-drift issue, arising from the non-IID nature of client data, often results in suboptimal personalization of a global model compared to locally trained models for each client. To tackle this challenge, we propose a novel approach named FedD2S for Personalized Federated Learning (pFL), leveraging knowledge distillation. FedD2S incorporates a deep-to-shallow layer-dropping mechanism in the data-free knowledge distillation process to enhance local model personalization. Through extensive simulations on diverse image datasets (FEMNIST, CIFAR-10, CINIC-10, and CIFAR-100), we compare FedD2S with state-of-the-art FL baselines. The proposed approach demonstrates superior performance, characterized by accelerated convergence and improved fairness among clients. The introduced layer-dropping technique effectively captures personalized knowledge, resulting in enhanced performance compared to alternative FL models. Moreover, we investigate the impact of key hyperparameters, such as the participation ratio and the layer-dropping rate, providing valuable insights into the optimal configuration of FedD2S. The findings demonstrate the efficacy of adaptive layer-dropping in the knowledge distillation process for achieving enhanced personalization and performance across diverse datasets and tasks.
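To make the deep-to-shallow idea concrete, the sketch below illustrates one plausible way a layer-dropping schedule could weight per-layer distillation losses: distillation starts over all layers and progressively excludes the deepest (most task-specific) ones as rounds proceed. This is a minimal illustration, not the authors' implementation; the loss form (feature matching via MSE), the schedule, and names such as `drop_rate` and `layer_drop_kd_loss` are assumptions, since the abstract only states that deeper layers are dropped first during data-free knowledge distillation.

```python
# Hypothetical sketch of a deep-to-shallow layer-dropping distillation loss.
# Assumption: per-layer feature-matching distillation, with one more of the
# deepest layers excluded every `drop_rate` rounds (names are illustrative).
import torch
import torch.nn.functional as F


def active_layers(num_layers: int, round_idx: int, drop_rate: int) -> int:
    """How many of the shallowest layers still participate in distillation.

    Every `drop_rate` rounds, one more layer is dropped starting from the
    deepest, until only the first (shallowest) layer remains.
    """
    dropped = round_idx // drop_rate
    return max(1, num_layers - dropped)


def layer_drop_kd_loss(student_feats, teacher_feats, round_idx, drop_rate=2):
    """Feature-matching distillation averaged over the layers still active."""
    keep = active_layers(len(student_feats), round_idx, drop_rate)
    loss = torch.zeros(())
    for s, t in zip(student_feats[:keep], teacher_feats[:keep]):
        loss = loss + F.mse_loss(s, t.detach())  # teacher features are frozen
    return loss / keep


if __name__ == "__main__":
    # Toy per-layer feature maps standing in for intermediate representations.
    feats_student = [torch.randn(4, 16), torch.randn(4, 32), torch.randn(4, 64)]
    feats_teacher = [torch.randn(4, 16), torch.randn(4, 32), torch.randn(4, 64)]
    for r in range(5):
        print(r, layer_drop_kd_loss(feats_student, feats_teacher, round_idx=r).item())
```

In this toy schedule, the distillation signal gradually concentrates on shallow, more general-purpose layers, which is one way the described mechanism could preserve personalized knowledge in a client's deeper layers.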