
Personalized Federated Instruction Tuning via Neural Architecture Search (2402.16919v1)

Published 26 Feb 2024 in cs.LG

Abstract: Federated Instruction Tuning (FIT) enables collaborative instruction tuning across massive numbers of data owners without sharing private data. However, it still faces two key challenges, i.e., data and resource heterogeneity. Due to the varying data distributions and preferences among data owners, FIT cannot adapt to the personalized data of individual owners. Moreover, clients with superior computational abilities are constrained because they must maintain the same fine-tuning architecture as the weaker clients. To address these issues, we propose a novel Personalized Federated Instruction Tuning (PerFIT) framework based on neural architecture search. Specifically, PerFIT allows each client to search for a personalized architecture by expanding the trainable parameter space of the global model and then pruning the parameters back to the original size. This procedure enables personalized instruction fine-tuning within an expanded parameter space while preserving the same number of trainable parameters. Furthermore, to unleash the capabilities of heterogeneous computational resources and enhance personalization on local data, we exploit personalized parameter-wise aggregation. Evaluations with multiple LLMs under non-IID scenarios demonstrate that, compared to state-of-the-art FIT methods, our approach achieves up to a 23% decrease in perplexity.
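
The expand-then-prune search described in the abstract can be illustrated with a minimal sketch, assuming LoRA-style low-rank adapters. This is not the authors' implementation: the ranks, the toy regression data standing in for a client's private instruction set, and the magnitude-based criterion for pruning rank-1 components back to the original parameter budget are all illustrative assumptions.

import torch

def expand_then_prune(d_in=64, d_out=64, base_rank=4, expanded_rank=8, steps=50):
    """Sketch: train a client-personalized LoRA adapter in an expanded rank
    space, then prune it back to the original trainable-parameter budget."""
    # Expanded low-rank factors: the adapter update is B @ A with rank `expanded_rank`.
    A = torch.randn(expanded_rank, d_in, requires_grad=True)
    B = torch.zeros(d_out, expanded_rank, requires_grad=True)
    opt = torch.optim.SGD([A, B], lr=1e-2)

    # Toy local data (placeholder for a client's private instruction data).
    x = torch.randn(128, d_in)
    y = torch.randn(128, d_out)

    for _ in range(steps):
        loss = torch.nn.functional.mse_loss(x @ A.t() @ B.t(), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Prune: keep the `base_rank` rank-1 components with the largest magnitude,
    # so the client returns to the same number of trainable parameters as the
    # global model while retaining a personalized structure.
    importance = A.norm(dim=1) * B.norm(dim=0)   # one score per rank component
    keep = importance.topk(base_rank).indices
    return A[keep].detach(), B[:, keep].detach()

if __name__ == "__main__":
    A, B = expand_then_prune()
    print(A.shape, B.shape)  # torch.Size([4, 64]) torch.Size([64, 4])

In a federated round, such pruned factors would then be combined with the server update via the personalized parameter-wise aggregation the paper describes; how that weighting is computed is specified in the paper itself, not in this sketch.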

Authors (6)
  1. Pengyu Zhang (26 papers)
  2. Yingbo Zhou (81 papers)
  3. Ming Hu (110 papers)
  4. Junxian Feng (1 paper)
  5. Jiawen Weng (3 papers)
  6. Mingsong Chen (53 papers)
Citations (2)