FedCLIP: Fast Generalization and Personalization for CLIP in Federated Learning (2302.13485v2)

Published 27 Feb 2023 in cs.LG and cs.AI

Abstract: Federated learning (FL) has emerged as a new paradigm for privacy-preserving computation in recent years. Unfortunately, FL faces two critical challenges that hinder its practical performance: data distribution heterogeneity and the high resource costs brought by large foundation models. Specifically, non-IID data across clients makes it hard for existing FL algorithms to converge, while high computational and communication costs increase the difficulty of real-world deployment. In this paper, we propose an effective yet simple method, named FedCLIP, to achieve fast generalization and personalization for CLIP in federated learning. Concretely, we design an attention-based adapter for the large model, CLIP, and the remaining operations depend solely on the adapter. The lightweight adapter makes the most of the pretrained model's information and keeps the model adaptive to clients' specific tasks, while the small-scale operations mitigate the computational and communication burdens caused by large models. Extensive experiments are conducted on three datasets with distribution shifts. Qualitative and quantitative results demonstrate that FedCLIP significantly outperforms other baselines (9% overall improvement on PACS) and effectively reduces computational and communication costs (283x faster than FedAVG). Our code will be available at: https://github.com/microsoft/PersonalizedFL.

FedCLIP: Efficient Federated Learning for CLIP

The paper presents FedCLIP, a method for improving both the generalization and personalization of the Contrastive Language-Image Pre-training (CLIP) model in federated learning (FL) environments. The motivation arises from two pivotal challenges: heterogeneity in data distribution across clients and the substantial resource demands of large foundation models. Both factors impede the applicability and efficiency of conventional FL approaches.

Key Contributions and Methodology

FedCLIP introduces an attention-based adapter, termed AttAI, attached to the CLIP image encoder. The adapter serves two main purposes: it concentrates the pretrained model's representations on task-relevant features, and it removes the need for full-model updates, thereby minimizing computational and communication overhead. Exploiting the pretrained model's inherent capabilities in this way yields substantial efficiency gains without compromising performance.
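The sketch below illustrates this design. It assumes PyTorch and a CLIP model exposing encode_image (as in OpenAI's clip package); the adapter's layer sizes and activations are illustrative assumptions, not necessarily the paper's exact AttAI configuration.

```python
import torch
import torch.nn as nn

class AttentionAdapter(nn.Module):
    """Lightweight attention-style adapter applied to frozen image features."""
    def __init__(self, feat_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, feat_dim),
            nn.Softmax(dim=-1),  # attention weights over feature dimensions
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Re-weight the frozen features rather than updating the backbone.
        return features * self.net(features)

class CLIPImageWithAdapter(nn.Module):
    """Frozen CLIP image encoder followed by a small trainable adapter."""
    def __init__(self, clip_model: nn.Module, feat_dim: int = 512):
        super().__init__()
        self.clip = clip_model
        for p in self.clip.parameters():  # the backbone is never updated
            p.requires_grad = False
        self.adapter = AttentionAdapter(feat_dim)

    def encode_image(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = self.clip.encode_image(images).float()
        return self.adapter(feats)
```

Only the adapter's parameters require gradients, so local training touches a small fraction of the model and only those weights ever need to leave the client.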

  1. Leveraging pretrained models: FedCLIP capitalizes on the pretrained CLIP model to extract generalized and diverse features. The AttAI adapter is trained locally, focusing the model's attention on task-specific features while reducing redundancy and preserving valuable prior knowledge.
  2. Adapter efficiency: Instead of updating and exchanging the whole network, FedCLIP communicates only the adapter's parameters, a small fraction of the trainable weights. This cuts computational and communication costs, with training reported to be 283x faster than traditional FedAVG (see the aggregation sketch after this list).
  3. Experimental verification: The method's effectiveness is confirmed through extensive experiments on the PACS, VLCS, and Office-Home datasets. FedCLIP consistently outperforms the baselines, with significant improvements in both generalization (approximately a 9% overall improvement on PACS) and personalization.
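To make the communication pattern in point 2 concrete, the sketch below averages only the clients' adapter state dicts in a standard FedAvg-style step. The helper name and round structure are hypothetical rather than taken from the released code, but they capture why per-round traffic stays small when the CLIP backbone is never exchanged.

```python
from collections import OrderedDict

def aggregate_adapters(client_states, client_weights):
    """Weighted average of per-client adapter state_dicts (FedAvg-style)."""
    total = float(sum(client_weights))
    global_state = OrderedDict()
    for key in client_states[0]:
        global_state[key] = sum(
            (w / total) * state[key].float()
            for state, w in zip(client_states, client_weights)
        )
    return global_state

# One communication round, schematically:
#   1. each client trains its local adapter and uploads adapter.state_dict()
#   2. the server calls aggregate_adapters(...) over the uploaded states
#   3. clients reload the result via adapter.load_state_dict(...)
# The frozen CLIP backbone itself is never transmitted.
```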

Implications and Future Prospects

FedCLIP's innovative use of adapters in FL presents valuable implications:

  • Resource Efficiency: By drastically reducing the number of trainable parameters, FedCLIP fits realistic computational constraints and makes FL more viable in resource-limited environments (a simple way to check the parameter footprint is sketched after this list).
  • Scalability and Deployment: The adapter approach is not tied to CLIP and could plausibly be applied to other architectures such as BERT and ViT, illustrating its flexibility across tasks and models.
  • Foundation for Future Research: While FedCLIP effectively addresses generalization and personalization, it opens pathways for further exploration into task-specific adapter designs and their integration into diverse FL scenarios.
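As a quick illustration of the resource-efficiency point, the generic PyTorch helper below (not part of FedCLIP's codebase) shows how one would compare the full model's footprint against the adapter-only set of trainable, communicated parameters.

```python
def count_parameters(module, trainable_only=False):
    """Count parameters, optionally restricted to those that are trained/communicated."""
    return sum(p.numel() for p in module.parameters()
               if p.requires_grad or not trainable_only)

# With the adapter-only setup sketched earlier:
#   count_parameters(model)                       -> full CLIP backbone + adapter
#   count_parameters(model, trainable_only=True)  -> adapter only (what is trained and sent)
```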

Conclusion

FedCLIP represents a meaningful advance in federated learning with large models such as CLIP. Its efficient approach to generalization and personalization is a pragmatic step toward using foundation models under constrained resources. As federated learning continues to expand, methods like FedCLIP will be important for meeting both practical and theoretical challenges in the domain. Future efforts will likely focus on refining the adapter design for greater task adaptability and further reduced computational demands.

Authors (4)
  1. Wang Lu
  2. Xixu Hu
  3. Jindong Wang
  4. Xing Xie