FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy (2302.10429v2)
Abstract: Federated learning is an emerging distributed machine learning framework that jointly trains a global model across a large number of local devices while preserving data privacy. Its performance suffers from the non-vanishing biases introduced by inconsistent local optima and from the severe client drift caused by local over-fitting. In this paper, we propose a novel and practical method, FedSpeed, to alleviate the negative impacts of these problems. Concretely, FedSpeed applies a prox-correction term to the current local updates to efficiently reduce the bias introduced by the prox-term, a regularizer that is necessary to maintain strong local consistency. Furthermore, FedSpeed merges the vanilla stochastic gradient with a perturbation computed from an extra gradient ascent step in the neighborhood, thereby alleviating local over-fitting. Our theoretical analysis indicates that the convergence rate depends on both the number of communication rounds $T$ and the local interval $K$, with an upper bound of $\mathcal{O}(1/T)$ when the local interval is set properly. Moreover, we conduct extensive experiments on real-world datasets to demonstrate the efficiency of our proposed FedSpeed, which converges significantly faster than several baselines and achieves state-of-the-art (SOTA) performance under general FL experimental settings. Our code is available at \url{https://github.com/woodenchild95/FL-Simulator.git}.
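To make the local update described in the abstract concrete, below is a minimal NumPy sketch of one client's round. It is not the authors' reference implementation (that lives in the linked repository): `grad_fn`, the mixing weight `alpha`, the ascent radius `rho`, the prox weight `lam`, and the correction state `h` are illustrative names and hyperparameters chosen here for exposition. The perturbed gradient plays the SAM-style role the abstract describes, the prox-term anchors the local iterates to the received global model, and `h` is carried across rounds to offset the bias that regularizer introduces.

```python
import numpy as np

def fedspeed_local_update(x_global, h, grad_fn, K=10, lr=0.1,
                          rho=0.05, alpha=0.5, lam=0.1):
    """One client's local round, sketched from the abstract's description.

    x_global : parameters received from the server this round
    h        : this client's prox-correction term, carried across rounds
    grad_fn  : returns a (stochastic) gradient at a given point
    """
    x = x_global.copy()
    for _ in range(K):
        g = grad_fn(x)                           # vanilla stochastic gradient
        # extra gradient ascent step in a rho-neighborhood (SAM-style)
        x_adv = x + rho * g / (np.linalg.norm(g) + 1e-12)
        g_adv = grad_fn(x_adv)                   # perturbed gradient
        g_mix = (1 - alpha) * g + alpha * g_adv  # merged update direction
        # prox-term pulls iterates toward the global model for consistency;
        # subtracting h offsets the bias this regularizer introduces
        x = x - lr * (g_mix + lam * (x - x_global) - h)
    # refresh the correction term with this round's residual drift
    h = h - lam * (x - x_global)
    return x, h

# toy usage: one client with quadratic objective f(x) = 0.5 * ||x - b||^2
b = np.array([1.0, -2.0])
grad_fn = lambda x: x - b
x_g, h = np.zeros(2), np.zeros(2)
for _ in range(5):                 # a few communication rounds
    x_g, h = fedspeed_local_update(x_g, h, grad_fn)
print(x_g)                         # approaches the client optimum b
```

In a full simulation the server would average the returned `x` over the sampled clients each round; the single-client loop above only illustrates how the perturbation, prox-term, and correction term interact within one local interval.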
Authors: Yan Sun, Li Shen, Tiansheng Huang, Liang Ding, Dacheng Tao