FedSym: Unleashing the Power of Entropy for Benchmarking the Algorithms for Federated Learning (2310.07807v1)

Published 11 Oct 2023 in cs.LG

Abstract: Federated learning (FL) is a decentralized machine learning approach where independent learners process data privately. Its goal is to create a robust and accurate model by aggregating and retraining local models over multiple rounds. However, FL faces challenges regarding data heterogeneity and model aggregation effectiveness. In order to simulate real-world data, researchers use methods for data partitioning that transform a dataset designated for centralized learning into a group of sub-datasets suitable for distributed machine learning with different data heterogeneity. In this paper, we study the currently popular data partitioning techniques and visualize their main disadvantages: the lack of precision in the data diversity, which leads to unreliable heterogeneity indexes, and the inability to incrementally challenge the FL algorithms. To resolve this problem, we propose a method that leverages entropy and symmetry to construct 'the most challenging' and controllable data distributions with gradual difficulty. We introduce a metric to measure data heterogeneity among the learning agents and a transformation technique that divides any dataset into splits with precise data diversity. Through a comparative study, we demonstrate the superiority of our method over existing FL data partitioning approaches, showcasing its potential to challenge model aggregation algorithms. Experimental results indicate that our approach gradually challenges the FL strategies, and the models trained on FedSym distributions are more distinct.
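
The abstract describes an entropy-based metric for measuring data heterogeneity across learning agents, but its exact formulation is not reproduced on this page. As a rough, illustrative sketch of how Shannon entropy can quantify the label diversity of each client's split (this is not the paper's actual FedSym metric, and the function names below are hypothetical):

```python
import numpy as np

def label_entropy(labels: np.ndarray, num_classes: int) -> float:
    """Shannon entropy (in bits) of a client's empirical label distribution."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    p = counts / counts.sum()
    p = p[p > 0]  # convention: 0 * log(0) = 0
    return float(-(p * np.log2(p)).sum())

def split_diversity(client_labels, num_classes: int) -> np.ndarray:
    """Per-client entropy, normalized to [0, 1] by the maximum log2(C)."""
    max_h = np.log2(num_classes)
    return np.array([label_entropy(y, num_classes) / max_h for y in client_labels])

# Three toy clients with decreasing label diversity over 10 classes.
rng = np.random.default_rng(0)
clients = [
    rng.integers(0, 10, size=1000),    # near-uniform labels -> diversity ~1.0
    rng.choice([0, 1, 2], size=1000),  # 3 of 10 classes -> ~log2(3)/log2(10) ~ 0.48
    np.zeros(1000, dtype=int),         # a single class -> diversity 0.0
]
print(split_diversity(clients, num_classes=10))
```

Under this reading, lowering every client's normalized entropy while keeping the global label distribution fixed produces progressively harder non-IID splits, which matches the abstract's notion of gradually challenging the FL strategies.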

Authors (3)
  1. Ensiye Kiyamousavi
  2. Boris Kraychev
  3. Ivan Koychev
