Adaptive Federated Learning via New Entropy Approach (2303.14966v3)

Published 27 Mar 2023 in cs.DC and cs.LG

Abstract: Federated Learning (FL) has emerged as a prominent distributed machine learning framework that enables geographically dispersed clients to train a global model collaboratively while preserving their privacy-sensitive data. However, due to the non-independent-and-identically-distributed (Non-IID) data generated by heterogeneous clients, the performance of conventional federated optimization schemes such as FedAvg and its variants deteriorates, requiring adaptive adjustment of specific model parameters to alleviate the negative influence of heterogeneity. In this paper, leveraging entropy as a new metric for assessing the degree of system disorder, we propose an adaptive FEDerated learning algorithm based on ENTropy theory (FedEnt) to alleviate the parameter deviation among heterogeneous clients and achieve fast convergence. However, given the data disparity and parameter deviation of heterogeneous clients, determining the optimal dynamic learning rate for each client is challenging, as there is no communication among participating clients during the local training epochs. To enable a decentralized learning rate for each participating client, we first introduce mean-field terms to estimate the components associated with other clients' local parameters. We then provide a rigorous theoretical analysis of the existence and determination of the mean-field estimators. Based on these estimators, the closed-form adaptive learning rate for each client is derived by constructing the Hamilton equation, and the convergence rate of the proposed FedEnt is proved. Extensive experiments on real-world datasets (i.e., MNIST, EMNIST-L, CIFAR10, and CIFAR100) show that FedEnt surpasses FedAvg and its variants (i.e., FedAdam, FedProx, and FedDyn) under Non-IID settings and achieves a faster convergence rate.
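
The abstract combines several moving parts: an entropy measure of cross-client disorder, a per-client adaptive learning rate, and FedAvg-style aggregation. The sketch below is only a minimal illustration of that general idea, not the paper's FedEnt derivation; the mean-field estimators and the Hamilton-equation step are omitted, and `entropy_adaptive_round` with its schedule is a hypothetical stand-in that uses Shannon entropy over client parameter deviations.

```python
import numpy as np

def shannon_entropy(weights):
    """Shannon entropy of the distribution obtained by normalizing `weights`."""
    p = weights / weights.sum()
    p = p[p > 0]                      # drop zeros to avoid log(0)
    return -np.sum(p * np.log(p))

def entropy_adaptive_round(global_model, client_grads, base_lr=0.1):
    """One hypothetical round: a client's step size shrinks as its gradient
    deviates from the cross-client mean, while a shared scale grows with the
    entropy of the deviation distribution (more uniform = less 'disorder')."""
    grads = np.stack(client_grads)                        # (n_clients, dim)
    deviations = np.linalg.norm(grads - grads.mean(axis=0), axis=1) + 1e-12
    ent = shannon_entropy(deviations)
    max_ent = np.log(len(client_grads))                   # entropy if deviations were uniform
    lrs = base_lr * (ent / max_ent) / (1.0 + deviations / deviations.mean())
    local_models = [global_model - lr * g for lr, g in zip(lrs, grads)]
    return np.mean(local_models, axis=0)                  # FedAvg-style aggregation

# Toy usage: four clients with increasingly noisy (heterogeneous) gradients.
rng = np.random.default_rng(0)
w = rng.normal(size=10)
grads = [rng.normal(scale=s, size=10) for s in (0.5, 1.0, 1.5, 2.0)]
w = entropy_adaptive_round(w, grads)
```

In this toy schedule, clients whose updates stray far from the crowd take smaller steps, which mimics (in a very loose sense) the paper's goal of damping parameter deviation among heterogeneous clients; FedEnt itself derives the per-client rate in closed form rather than by such a heuristic.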

Authors (4)
  1. Shensheng Zheng (2 papers)
  2. Wenhao Yuan (8 papers)
  3. Xuehe Wang (11 papers)
  4. Lingjie Duan (89 papers)
Citations (1)