FL-TAC: Enhanced Fine-Tuning in Federated Learning via Low-Rank, Task-Specific Adapter Clustering (2404.15384v1)

Published 23 Apr 2024 in cs.LG and cs.AI

Abstract: Although large-scale pre-trained models hold great potential for adapting to downstream tasks through fine-tuning, the performance of such fine-tuned models is often limited by the difficulty of collecting sufficient high-quality, task-specific data. Federated Learning (FL) offers a promising solution by enabling fine-tuning across a large number of clients with diverse task data, but it is bottlenecked by significant communication overhead due to the extensive size of pre-trained models. This paper addresses the high communication cost of fine-tuning large pre-trained models within FL frameworks through low-rank fine-tuning. Specifically, we train a low-rank adapter for each individual task on the client side, followed by server-side clustering of similar adapters to achieve task-specific aggregation. Extensive experiments on various language and vision tasks, such as GLUE and CIFAR-10/100, reveal the evolution of task-specific adapters throughout the FL training process and verify the effectiveness of the proposed low-rank task-specific adapter clustering (TAC) method.
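
The abstract pairs client-side low-rank (LoRA-style) adapter training with server-side clustering of the uploaded adapters, so aggregation happens per task rather than globally. The sketch below illustrates only the server-side step under stated assumptions: the abstract does not specify the clustering algorithm or the similarity feature, so k-means over the flattened low-rank update B @ A, the function name aggregate_adapters, and the synthetic two-task demo are illustrative choices, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of server-side task-specific
# aggregation: each client uploads a low-rank adapter (A, B), the server
# clusters the adapters and averages within each cluster. The choice of
# k-means and of B @ A as the clustering feature are assumptions.

import numpy as np
from sklearn.cluster import KMeans


def aggregate_adapters(adapters, num_tasks):
    """Cluster client adapters and average within each cluster.

    adapters : list of (A, B) pairs, A of shape (r, d_in), B of shape (d_out, r)
    num_tasks: assumed number of distinct downstream tasks (cluster count)
    Returns a dict mapping cluster id -> averaged (A, B) pair.
    """
    # Represent each adapter by its flattened low-rank weight update B @ A.
    features = np.stack([(B @ A).ravel() for A, B in adapters])

    labels = KMeans(n_clusters=num_tasks, n_init=10,
                    random_state=0).fit_predict(features)

    aggregated = {}
    for k in range(num_tasks):
        members = [adapters[i] for i in np.flatnonzero(labels == k)]
        A_mean = np.mean([A for A, _ in members], axis=0)
        B_mean = np.mean([B for _, B in members], axis=0)
        aggregated[k] = (A_mean, B_mean)
    return aggregated


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_out, r = 32, 32, 4
    # Two synthetic "tasks": clients of the same task share a base adapter.
    base = [(rng.normal(size=(r, d_in)), rng.normal(size=(d_out, r)))
            for _ in range(2)]
    clients = [(A + 0.01 * rng.normal(size=A.shape),
                B + 0.01 * rng.normal(size=B.shape))
               for A, B in base for _ in range(5)]
    groups = aggregate_adapters(clients, num_tasks=2)
    print({k: (A.shape, B.shape) for k, (A, B) in groups.items()})
```

In a full FL round, each cluster average would presumably be broadcast back only to the clients assigned to that cluster, so the per-round payload scales with the adapter size rather than with the full pre-trained model.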
