FL-TAC: Enhanced Fine-Tuning in Federated Learning via Low-Rank, Task-Specific Adapter Clustering (2404.15384v1)
Abstract: Although large-scale pre-trained models hold great potential for adapting to downstream tasks through fine-tuning, the performance of such fine-tuned models is often limited by the difficulty of collecting sufficient high-quality, task-specific data. Federated Learning (FL) offers a promising solution by enabling fine-tuning across large-scale clients with a variety of task data, but it is bottlenecked by significant communication overhead due to the pre-trained models' extensive size. This paper addresses the high communication cost of fine-tuning large pre-trained models within FL frameworks through low-rank fine-tuning. Specifically, we train a low-rank adapter for each individual task on the client side, followed by server-side clustering of similar adapters into groups to achieve task-specific aggregation. Extensive experiments on various language and vision tasks, such as GLUE and CIFAR-10/100, reveal the evolution of task-specific adapters throughout the FL training process and verify the effectiveness of the proposed low-rank task-specific adapter clustering (TAC) method.
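The sketch below illustrates the server-side clustering-and-aggregation idea described in the abstract: clients upload low-rank (LoRA-style) adapter updates, the server groups similar adapters, and each group is averaged into a task-specific adapter. This is only a toy sketch, not the paper's implementation; the choice of k-means, the function name `cluster_and_aggregate`, and the synthetic adapter vectors are assumptions for illustration.

```python
# Minimal sketch (assumption, not the authors' code): cluster flattened low-rank
# adapter updates uploaded by clients and average within each cluster, so clients
# working on the same task receive a shared, task-specific aggregated adapter.
import numpy as np
from sklearn.cluster import KMeans


def cluster_and_aggregate(adapters: np.ndarray, num_clusters: int) -> dict:
    """adapters: (num_clients, adapter_dim) array of flattened low-rank adapter updates."""
    labels = KMeans(n_clusters=num_clusters, n_init=10, random_state=0).fit_predict(adapters)
    # Task-specific aggregation: average the adapters that fall into the same cluster.
    return {int(c): adapters[labels == c].mean(axis=0) for c in np.unique(labels)}


# Toy usage: 8 clients whose adapters come from two well-separated "tasks".
rng = np.random.default_rng(0)
adapters = np.vstack([rng.normal(0.0, 0.1, (4, 16)), rng.normal(1.0, 0.1, (4, 16))])
per_task_adapter = cluster_and_aggregate(adapters, num_clusters=2)
for cluster_id, adapter in per_task_adapter.items():
    # Each cluster's aggregated adapter would be sent back to that cluster's clients.
    print(cluster_id, adapter[:4])
```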
- Intrinsic dimensionality explains the effectiveness of language model fine-tuning. In International Joint Conference on Natural Language Processing, pp. 7319–7328, 2021.
- Federated learning with hierarchical clustering of local updates to improve training on non-IID data. In International Joint Conference on Neural Networks, pp. 1–9, 2020.
- Do as I can, not as I say: Grounding language in robotic affordances. In Conference on Robot Learning, pp. 287–318, 2023.
- Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.
- BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
- An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2020.
- On the effectiveness of parameter-efficient fine-tuning. In AAAI Conference on Artificial Intelligence, pp. 12799–12807, 2023.
- Privacy-preserving heterogeneous federated transfer learning. In IEEE International Conference on Big Data, pp. 2552–2559, 2019.
- An efficient framework for clustered federated learning. Advances in Neural Information Processing Systems, 33:19586–19597, 2020.
- Parameter-efficient transfer learning for NLP. In International Conference on Machine Learning, pp. 2790–2799, 2019.
- Federated visual classification with real-world data distribution. In European Conference on Computer Vision, pp. 76–92, 2020.
- LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2021.
- Compacter: Efficient low-rank hypercomplex adapter layers. Advances in Neural Information Processing Systems, 34:1022–1035, 2021.
- Survey of personalization techniques for federated learning. In Conference on Smart Trends in Systems, Security and Sustainability, pp. 794–797, 2020.
- The power of scale for parameter-efficient prompt tuning. In Conference on Empirical Methods in Natural Language Processing, pp. 3045–3059, 2021.
- Prefix-tuning: Optimizing continuous prompts for generation. In International Joint Conference on Natural Language Processing, pp. 4582–4597, 2021.
- Federated learning in mobile edge networks: A comprehensive survey. IEEE Communications Surveys & Tutorials, 22(3):2031–2063, 2020.
- A secure federated transfer learning framework. IEEE Intelligent Systems, 35(4):70–82, 2020.
- Three approaches for personalization with applications to federated learning. arXiv preprint arXiv:2002.10619, 2020.
- SAFARI: Sparsity-enabled federated learning with limited and unreliable communications. IEEE Transactions on Mobile Computing, 2023.
- UMAP: Uniform manifold approximation and projection. Journal of Open Source Software, 3(29):861, 2018.
- Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pp. 1273–1282, 2017.
- Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35:27730–27744, 2022.
- Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints. IEEE Transactions on Neural Networks and Learning Systems, 32(8):3710–3722, 2020.
- LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023.
- GLUE: A multi-task benchmark and analysis platform for natural language understanding. In International Conference on Learning Representations, 2019.
- FedLoRA: Model-heterogeneous personalized federated learning with LoRA tuning. arXiv preprint arXiv:2310.13283, 2023.
- BitFit: Simple parameter-efficient fine-tuning for transformer-based masked language-models. In Annual Meeting of the Association for Computational Linguistics, pp. 1–9, 2022.
- The expressive power of low-rank adaptation. In Optimization for Machine Learning, 2023.
- Towards building the federated GPT: Federated instruction tuning. arXiv preprint arXiv:2305.05644, 2023.
- OPT: Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068, 2022.
- Towards efficient communications in federated learning: A contemporary survey. Journal of the Franklin Institute, 2023.
- On the opportunities of green computing: A survey. arXiv preprint arXiv:2311.00447, 2023.
- Data-free knowledge distillation for heterogeneous federated learning. In International Conference on Machine Learning, pp. 12878–12889, 2021.