FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy (2302.10429v2)

Published 21 Feb 2023 in cs.LG, cs.DC, and math.OC

Abstract: Federated learning is an emerging distributed machine learning framework that jointly trains a global model across a large number of local devices while protecting data privacy. Its performance suffers from non-vanishing biases introduced by locally inconsistent optima and from rugged client drift caused by local over-fitting. In this paper, we propose a novel and practical method, FedSpeed, to alleviate the negative impacts of these problems. Concretely, FedSpeed applies a prox-correction term to the current local updates to efficiently reduce the bias introduced by the prox-term, a regularizer necessary for maintaining strong local consistency. Furthermore, FedSpeed merges the vanilla stochastic gradient with a perturbation computed from an extra gradient-ascent step in the neighborhood, thereby alleviating local over-fitting. Our theoretical analysis indicates that the convergence rate depends on both the number of communication rounds $T$ and the local interval $K$, with an upper bound of $\small \mathcal{O}(1/T)$ when the local interval is set properly. Moreover, we conduct extensive experiments on real-world datasets to demonstrate the efficiency of FedSpeed, which converges significantly faster than several baselines and achieves state-of-the-art (SOTA) performance in general FL experimental settings. Our code is available at \url{https://github.com/woodenchild95/FL-Simulator.git}.
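The abstract names three ingredients for each client's local interval: a prox-term that pulls the local iterate toward the received global model, a prox-correction term that cancels the bias this regularizer introduces, and an extra gradient-ascent (sharpness-aware) step whose gradient is merged with the vanilla stochastic gradient. The following is a minimal PyTorch sketch of one such local interval, written from the abstract alone rather than from the paper's reference implementation; the function name fedspeed_local_update, the hyper-parameters rho (perturbation radius), alpha (merging weight), and lam (prox coefficient), and the correction update rule are illustrative assumptions.

    import torch

    def fedspeed_local_update(model, global_params, correction, loader, loss_fn,
                              lr=0.1, rho=0.05, alpha=0.5, lam=0.1, local_steps=5):
        """One client's local interval, sketching the three ingredients from the
        abstract: a prox-term toward the received global model, a running
        prox-correction that offsets the bias of that regularizer, and a
        sharpness-aware extra gradient-ascent step whose gradient is merged
        with the vanilla stochastic gradient."""
        for _, (x, y) in zip(range(local_steps), loader):
            params = list(model.parameters())

            # Vanilla stochastic gradient at the current point.
            g_plain = torch.autograd.grad(loss_fn(model(x), y), params)

            # Extra gradient-ascent step to a nearby point, then re-evaluate
            # the gradient at the perturbed point.
            scale = rho / (torch.sqrt(sum((g ** 2).sum() for g in g_plain)) + 1e-12)
            with torch.no_grad():
                for p, g in zip(params, g_plain):
                    p.add_(scale * g)                      # ascend
            g_pert = torch.autograd.grad(loss_fn(model(x), y), params)
            with torch.no_grad():
                for p, g in zip(params, g_plain):
                    p.sub_(scale * g)                      # undo the ascent

            # Descend along the merged gradient plus the prox-term, minus the
            # accumulated prox-correction.
            with torch.no_grad():
                for p, gp, gq, wg, c in zip(params, g_plain, g_pert,
                                            global_params, correction):
                    d = (1 - alpha) * gp + alpha * gq + lam * (p - wg) - c
                    p.sub_(lr * d)

        # Refresh the per-client prox-correction from the residual prox-gradient
        # (illustrative update rule) so its bias is cancelled in later rounds.
        with torch.no_grad():
            for c, p, wg in zip(correction, model.parameters(), global_params):
                c.sub_(lam * (p - wg))

        return model

In this sketch, global_params would be detached copies of the server model's parameters and correction a per-client list of zero-initialized tensors of matching shapes that persists across rounds; the server then averages the returned client models as usual. The exact merging weight, correction update, and hyper-parameter settings used in the paper may differ from this reading of the abstract.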

Authors (5)
  1. Yan Sun (309 papers)
  2. Li Shen (363 papers)
  3. Tiansheng Huang (30 papers)
  4. Liang Ding (159 papers)
  5. Dacheng Tao (830 papers)
Citations (41)

