FedCAda: Adaptive Client-Side Optimization for Accelerated and Stable Federated Learning (2405.11811v1)
Abstract: Federated learning (FL) has emerged as a prominent approach for collaboratively training machine learning models across distributed clients while preserving data privacy. However, balancing acceleration and stability remains a significant challenge in FL, especially on the client side. In this paper, we introduce FedCAda, an innovative federated client-side adaptive algorithm designed to tackle this challenge. FedCAda leverages the Adam algorithm, adjusting the correction process of the first moment estimate $m$ and the second moment estimate $v$ on the client side and aggregating the adaptive algorithm's parameters on the server side, aiming to accelerate convergence and improve communication efficiency while ensuring stability and performance. Additionally, we investigate several variants incorporating different adjustment functions. This comparative analysis reveals that, because client models contain limited information from other clients during the early stages of federated learning, stronger constraints must be imposed on the adaptive algorithm's parameters. As federated learning progresses and clients accumulate more global information, FedCAda gradually relaxes its adjustment of the adaptive parameters. These findings provide insights for enhancing the robustness and efficiency of algorithmic improvements. Through extensive experiments on computer vision (CV) and natural language processing (NLP) datasets, we demonstrate that FedCAda outperforms state-of-the-art methods in terms of adaptability, convergence, stability, and overall performance. This work contributes to adaptive algorithms for federated learning and encourages further exploration.
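The abstract's client/server split can be sketched in code. The exact adjustment function FedCAda applies to the bias corrections of $m$ and $v$ is not specified here, so the interpolation weight `alpha` below (constraining the corrected moments early and approaching standard Adam as rounds progress) is a hypothetical stand-in, as is the simple server-side averaging of the adaptive state:

```python
import numpy as np

def client_adam_step(w, grad, m, v, t, round_idx, total_rounds,
                     lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One client-side Adam step with an adjusted bias correction.

    Hypothetical sketch: the true FedCAda adjustment function is not
    given in the abstract. Here we damp the bias correction early in
    training and let it approach standard Adam in later rounds.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Standard Adam bias corrections.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Assumed constraint: interpolate toward the raw (uncorrected)
    # moments early on, weakening the adjustment as rounds accumulate.
    alpha = round_idx / max(total_rounds, 1)  # ~0 early -> 1 late
    m_adj = alpha * m_hat + (1 - alpha) * m
    v_adj = alpha * v_hat + (1 - alpha) * v
    w = w - lr * m_adj / (np.sqrt(v_adj) + eps)
    return w, m, v

def server_aggregate(client_states, weights):
    """Weighted average of each client's (w, m, v) triple, reflecting
    the server-side aggregation of adaptive parameters described in
    the abstract (averaging is an assumption, not the paper's rule)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    w = sum(wt * s[0] for wt, s in zip(weights, client_states))
    m = sum(wt * s[1] for wt, s in zip(weights, client_states))
    v = sum(wt * s[2] for wt, s in zip(weights, client_states))
    return w, m, v
```

Note that, unlike FedAvg, the server here averages the optimizer moments alongside the model weights, so clients resume from a shared adaptive state at each round.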
- Liuzhi Zhou
- Yu He
- Kun Zhai
- Xiang Liu
- Sen Liu
- Xingjun Ma
- Guangnan Ye
- Yu-Gang Jiang
- Hongfeng Chai