Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation (2401.14211v3)
Abstract: Federated Learning (FL) is a promising technique for the collaborative training of deep neural networks across multiple devices while preserving data privacy. Despite its potential benefits, FL is hindered by excessive communication costs due to repeated server-client communication during training. To address this challenge, model compression techniques such as sparsification and weight clustering have been applied; however, these often require modifying the underlying model aggregation scheme or involve cumbersome hyperparameter tuning, where the latter not only governs the model's compression rate but also limits the model's potential for continuous improvement over growing data. In this paper, we propose FedCompress, a novel approach that combines dynamic weight clustering and server-side knowledge distillation to reduce communication costs while learning highly generalizable models. Through a comprehensive evaluation on diverse public datasets, we demonstrate the efficacy of our approach compared to baselines in terms of communication costs and inference speed.
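To make the two components named in the abstract concrete, below is a minimal, illustrative sketch rather than the paper's actual implementation: k-means weight clustering compresses a model update into a small centroid table plus low-bit assignment indices, and a temperature-scaled distillation loss of the kind a server could use to refine the aggregated model. The function names, the cluster count of 16, and the temperature are assumptions for illustration; FedCompress's dynamic adaptation of the cluster count and its specific server-side distillation setup are not reproduced here.

```python
# Illustrative sketch only: weight clustering for compressed uploads and a
# temperature-scaled distillation loss. Names and hyperparameters are assumed,
# not taken from the FedCompress implementation.
import numpy as np
from sklearn.cluster import KMeans


def cluster_weights(weights: np.ndarray, n_clusters: int = 16):
    """Quantize a weight tensor to a small set of shared values via k-means.

    Transmitting the centroid table plus low-bit assignment indices, instead of
    full-precision floats, is what shrinks the client-to-server payload.
    """
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)
    centroids = km.cluster_centers_.ravel()        # n_clusters float values
    assignments = km.labels_.astype(np.uint8)      # one small index per weight
    return centroids, assignments


def reconstruct_weights(centroids: np.ndarray, assignments: np.ndarray, shape):
    """Server side: rebuild an approximate weight tensor from the compressed form."""
    return centroids[assignments].reshape(shape)


def distillation_loss(student_logits: np.ndarray, teacher_logits: np.ndarray, T: float = 2.0):
    """Temperature-softened KL divergence, in the style of Hinton et al. (2015)."""
    def softmax(x):
        z = (x - x.max(axis=-1, keepdims=True)) / T
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    p_teacher, p_student = softmax(teacher_logits), softmax(student_logits)
    kl = (p_teacher * (np.log(p_teacher + 1e-9) - np.log(p_student + 1e-9))).sum(axis=-1)
    return float(kl.mean() * T * T)


if __name__ == "__main__":
    w = np.random.randn(4096).astype(np.float32)   # stand-in for one layer's weights
    centroids, idx = cluster_weights(w)
    w_hat = reconstruct_weights(centroids, idx, w.shape)
    print("reconstruction error:", np.abs(w - w_hat).mean())

    teacher = np.random.randn(8, 10)               # aggregated (teacher) model logits
    student = teacher + 0.1 * np.random.randn(8, 10)
    print("distillation loss:", distillation_loss(student, teacher))
```

With 16 clusters, each weight is represented by a 4-bit index plus a shared centroid table, roughly an 8x reduction over 32-bit floats; the accuracy-versus-compression trade-off depends on how the cluster count is chosen, which is the part FedCompress adapts dynamically during training.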
- Vasileios Tsouvalas
- Aaqib Saeed
- Tanir Ozcelebi
- Nirvana Meratnia