
Federated Two Stage Decoupling With Adaptive Personalization Layers (2308.15821v2)

Published 30 Aug 2023 in cs.LG and cs.AI

Abstract: Federated learning has gained significant attention due to its groundbreaking ability to enable distributed learning while maintaining privacy constraints. However, as a consequence of data heterogeneity among decentralized devices, it inherently suffers from significant learning degradation and slow convergence. It is therefore natural to cluster homogeneous clients into the same group and to aggregate model weights only within each group. Most existing clustered federated learning methods employ either model gradients or inference outputs as the metric for client partitioning, aiming to group similar devices together, yet heterogeneity may still remain within each cluster. Moreover, there is little research into the underlying reasons for determining the appropriate timing of clustering, which leads to the common practice of assigning each client to its own individual cluster, particularly under highly non-independent and identically distributed (non-IID) data. In this paper, we introduce FedTSDP, a two-stage decoupling federated learning algorithm with adaptive personalization layers, in which client clustering is performed twice: first according to inference outputs and then according to model weights. Hopkins amended sampling is adopted to determine the appropriate timing for clustering and the sampling weight of public unlabeled data. In addition, a simple yet effective approach is developed to adaptively adjust the personalization layers based on varying degrees of data skew. Experimental results show that the proposed method performs reliably in both IID and non-IID scenarios.
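The abstract mentions a Hopkins-based criterion for deciding when clustering is worthwhile. The paper's "Hopkins amended sampling" is not detailed on this page; as a rough, hypothetical illustration of the underlying idea, the sketch below computes the plain Hopkins statistic (a standard measure of cluster tendency, close to 0.5 for uniformly scattered data and close to 1 for clustered data) over per-client representations and uses it to gate the clustering step. The function name, the 0.75 threshold, and the choice of representation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hopkins_statistic(X, sample_frac=0.1, rng=None):
    """Estimate the Hopkins statistic of a point set X (n x d).

    Values near 0.5 suggest uniformly scattered points (no cluster
    structure); values near 1 suggest strong cluster tendency.
    Generic implementation, not the paper's amended variant.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    m = max(1, int(sample_frac * n))

    # m synthetic points drawn uniformly from the bounding box of X
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = rng.uniform(lo, hi, size=(m, d))

    # m real points sampled from X without replacement
    idx = rng.choice(n, size=m, replace=False)
    W = X[idx]

    def nn_dist(points, exclude_self=False):
        # distance from each query point to its nearest neighbour in X
        dists = np.linalg.norm(points[:, None, :] - X[None, :, :], axis=2)
        if exclude_self:
            dists[np.arange(m), idx] = np.inf  # ignore the point itself
        return dists.min(axis=1)

    u = nn_dist(U)                     # synthetic-to-data distances
    w = nn_dist(W, exclude_self=True)  # data-to-data distances
    return u.sum() / (u.sum() + w.sum())

# Hypothetical gating example: cluster clients only once their
# representations (e.g. inference outputs or weight deltas) show
# measurable cluster tendency; the 0.75 threshold is an assumption.
client_reprs = np.random.randn(50, 16)
if hopkins_statistic(client_reprs) > 0.75:
    print("cluster tendency detected: run client clustering")
```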

Authors (3)
  1. Hangyu Zhu (12 papers)
  2. Yuxiang Fan (7 papers)
  3. Zhenping Xie (8 papers)
