Parallel Split Learning with Global Sampling (2407.15738v3)

Published 22 Jul 2024 in cs.LG, cs.AI, and cs.DC

Abstract: Distributed deep learning in resource-constrained environments faces scalability and generalization challenges due to large effective batch sizes and non-identically distributed client data. We introduce a server-driven sampling strategy that maintains a fixed global batch size by dynamically adjusting client-side batch sizes. This decouples the effective batch size from the number of participating devices and ensures that global batches better reflect the overall data distribution. Using standard concentration bounds, we establish tighter deviation guarantees compared to existing approaches. Empirical results on a benchmark dataset confirm that the proposed method improves model accuracy, training efficiency, and convergence stability, offering a scalable solution for learning at the network edge.
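
The abstract describes the core mechanism only at a high level: the server keeps the global batch size fixed and dynamically assigns per-client batch sizes so that each global batch reflects the overall data distribution. The sketch below is a minimal illustration of that idea, not the paper's implementation; the helper name allocate_client_batch_sizes and the specific rule of drawing each slot of the global batch from clients in proportion to their local dataset sizes are assumptions made for the example.

```python
import numpy as np

def allocate_client_batch_sizes(global_batch_size, client_dataset_sizes, rng=None):
    """Assign per-client batch sizes that always sum to a fixed global batch size.

    Assumed sampling rule (not specified in the abstract): each of the
    global_batch_size sample slots is drawn from a client with probability
    proportional to that client's local dataset size, so the assembled
    global batch approximates the overall data distribution regardless of
    how many clients participate.
    """
    rng = np.random.default_rng() if rng is None else rng
    sizes = np.asarray(client_dataset_sizes, dtype=float)
    probs = sizes / sizes.sum()
    # One multinomial draw over clients: the counts sum exactly to global_batch_size,
    # decoupling the effective batch size from the number of clients.
    counts = rng.multinomial(global_batch_size, probs)
    return counts.tolist()

# Example: 4 clients with unbalanced local datasets, fixed global batch of 128.
if __name__ == "__main__":
    batch_sizes = allocate_client_batch_sizes(128, [5000, 1200, 800, 300])
    print(batch_sizes, "-> sum =", sum(batch_sizes))
```

Under this assumed rule, each slot of the global batch is sampled independently, which is the kind of setting where standard concentration bounds (e.g., Hoeffding- or Chernoff-type inequalities) can bound how far a client's or class's share of the global batch deviates from its share of the overall data, matching the style of deviation guarantee the abstract mentions.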

