LW-FedSSL: Resource-efficient Layer-wise Federated Self-supervised Learning (2401.11647v3)

Published 22 Jan 2024 in cs.LG and cs.AI

Abstract: Many studies integrate federated learning (FL) with self-supervised learning (SSL) to take advantage of raw data distributed across edge devices. However, edge devices often struggle with the high computation and communication costs imposed by SSL and FL algorithms. To address this challenge, we propose LW-FedSSL, a layer-wise federated self-supervised learning approach that allows edge devices to incrementally train a single layer of the model at a time. We introduce server-side calibration and representation alignment mechanisms to ensure LW-FedSSL delivers performance on par with conventional federated self-supervised learning (FedSSL) while significantly lowering resource demands. In a pure layer-wise training scheme, training one layer at a time may limit effective interaction between different layers of the model. The server-side calibration mechanism takes advantage of the resource-rich FL server to ensure smooth collaboration between different layers of the global model. During local training, the representation alignment mechanism encourages closeness between representations of local models and those of the global model, thereby preserving the layer cohesion established by server-side calibration. With the proposed mechanisms, LW-FedSSL achieves a $3.3 \times$ reduction in memory usage, $2.1 \times$ fewer computational operations (FLOPs), and a $3.2 \times$ lower communication cost while maintaining the same level of performance as its end-to-end training counterpart. Additionally, we explore a progressive training strategy called Prog-FedSSL, which matches end-to-end training in memory requirements but offers a $1.8 \times$ reduction in FLOPs and communication costs. Although Prog-FedSSL is not as resource-efficient as LW-FedSSL, its performance improvements make it a suitable candidate for FL environments with more lenient resource constraints.
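To make the layer-wise scheme concrete, below is a minimal PyTorch sketch of one client's local update under several stated assumptions: the backbone is an `nn.Sequential` of layers, the data loader yields two augmented views per sample, a simple SimCLR-style contrastive loss stands in for the paper's SSL objective, and the alignment weight `lam` and all function names are illustrative, not the authors' implementation.

```python
# Hypothetical sketch of an LW-FedSSL-style local update, illustrating
# (i) training a single "active" layer per stage and (ii) a representation
# alignment term that keeps local features close to the frozen global model.
# The loss choices, `lam`, and module layout are assumptions for illustration.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def local_layerwise_step(global_model: nn.Sequential,
                         active_idx: int,
                         data_loader,
                         lam: float = 1.0,
                         lr: float = 1e-3,
                         device: str = "cpu"):
    """One client's local update: only layers[active_idx] is trainable."""
    local_model = copy.deepcopy(global_model).to(device)
    frozen_global = copy.deepcopy(global_model).to(device).eval()

    # Freeze everything except the currently active layer.
    for i, layer in enumerate(local_model):
        for p in layer.parameters():
            p.requires_grad = (i == active_idx)

    params = [p for p in local_model.parameters() if p.requires_grad]
    opt = torch.optim.SGD(params, lr=lr)

    for x1, x2 in data_loader:          # two augmented views per sample (assumed)
        x1, x2 = x1.to(device), x2.to(device)
        z1, z2 = local_model(x1), local_model(x2)

        # SimCLR-style contrastive loss between the two views
        # (a stand-in for the paper's SSL objective).
        z1n, z2n = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        logits = z1n @ z2n.t() / 0.1
        labels = torch.arange(z1.size(0), device=device)
        ssl_loss = F.cross_entropy(logits, labels)

        # Representation alignment: pull local features toward the frozen
        # global model's features for the same inputs (cosine distance).
        with torch.no_grad():
            g1 = F.normalize(frozen_global(x1), dim=1)
        align_loss = (1.0 - (z1n * g1).sum(dim=1)).mean()

        loss = ssl_loss + lam * align_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Only the active layer's parameters are uploaded to the server.
    return {k: v.cpu() for k, v in local_model[active_idx].state_dict().items()}
```

In this sketch only the active layer is updated and uploaded, which is where the memory, FLOPs, and communication savings would come from; the server-side calibration step described in the abstract runs on the server between stages and is not shown here.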
