Speed Up Federated Learning in Heterogeneous Environment: A Dynamic Tiering Approach (2312.05642v1)

Published 9 Dec 2023 in cs.LG, cs.AI, cs.MA, and cs.PF

Abstract: Federated learning (FL) enables collaborative model training while keeping the training data decentralized and private. However, a significant impediment to training models with FL, especially large models, is the resource constraints of devices with heterogeneous computation and communication capacities as well as varying task sizes. Such heterogeneity causes large variations in client training times, which prolongs the overall training time and wastes the resources of faster clients. To tackle these heterogeneity issues, we propose the Dynamic Tiering-based Federated Learning (DTFL) system, in which slower clients dynamically offload part of the model to the server to alleviate resource constraints and speed up training. By leveraging the concept of Split Learning, DTFL offloads different portions of the global model to clients in different tiers and enables each client to update the model in parallel via local-loss-based training. This reduces the computation and communication demand on resource-constrained devices and thus mitigates the straggler problem. DTFL introduces a dynamic tier scheduler that uses tier profiling to estimate the expected training time of each client based on its historical training time, communication speed, and dataset size; the scheduler then assigns clients to suitable tiers to minimize the overall training time in each round. We first theoretically prove the convergence properties of DTFL. We then train large models (ResNet-56 and ResNet-110) on popular image datasets (CIFAR-10, CIFAR-100, CINIC-10, and HAM10000) under both IID and non-IID settings. Extensive experimental results show that, compared with state-of-the-art FL methods, DTFL significantly reduces training time while maintaining model accuracy.
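The tier-scheduling idea in the abstract can be made concrete with a short sketch. The Python snippet below is a minimal, illustrative take on tier profiling and assignment, not the authors' implementation: the names (ClientProfile, estimate_round_time, assign_tiers), the linear cost model, and the default byte counts are all assumptions made for the example. It only shows the general shape of the estimate (compute shrinks and activation traffic grows as more of the model is offloaded) and a greedy per-client tier choice.

```python
# Illustrative sketch of a dynamic tier scheduler in the spirit of DTFL.
# Assumption: tiers run from 1 (client keeps the full model, no offloading)
# to num_tiers (client keeps only a small front portion and ships
# activations for the rest to the server).

from dataclasses import dataclass


@dataclass
class ClientProfile:
    compute_time_per_sample: float  # seconds/sample, profiled from past rounds
    comm_speed: float               # uplink bytes/second, measured
    dataset_size: int               # number of local training samples


def estimate_round_time(p: ClientProfile, tier: int, num_tiers: int,
                        activation_bytes_per_sample: float) -> float:
    """Estimate one round's training time for a client placed in `tier`.

    Assumed cost model: a client in tier t keeps roughly
    (num_tiers - t + 1) / num_tiers of the model locally, so local compute
    falls with the tier index while the activations it must upload grow.
    """
    local_fraction = (num_tiers - tier + 1) / num_tiers
    offload_fraction = (tier - 1) / num_tiers
    compute = p.compute_time_per_sample * p.dataset_size * local_fraction
    comm = activation_bytes_per_sample * p.dataset_size * offload_fraction / p.comm_speed
    return compute + comm


def assign_tiers(profiles: list[ClientProfile], num_tiers: int,
                 activation_bytes_per_sample: float = 4e5) -> list[int]:
    """Greedily pick, for each client, the tier with the smallest estimate.

    A round is dominated by its slowest client, so shrinking each client's
    own estimated time is a simple proxy for reducing the straggler gap.
    """
    return [
        min(range(1, num_tiers + 1),
            key=lambda t: estimate_round_time(p, t, num_tiers,
                                              activation_bytes_per_sample))
        for p in profiles
    ]


if __name__ == "__main__":
    clients = [
        ClientProfile(2e-3, 1e6, 500),   # slow device, modest uplink
        ClientProfile(2e-4, 5e6, 500),   # fast device
    ]
    print(assign_tiers(clients, num_tiers=4))
```

Under this toy model the slow client is pushed to a higher tier (more offloading), while the fast client stays in a low tier; the paper's scheduler additionally updates these estimates from historical training times each round.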

Authors (4)
  1. Seyed Mahmoud Sajjadi Mohammadabadi (3 papers)
  2. Syed Zawad (12 papers)
  3. Feng Yan (67 papers)
  4. Lei Yang (372 papers)
Citations (5)