Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models (2404.06448v2)
Abstract: Recently, there has been a surge in the development of artificial intelligence-generated content (AIGC), especially large language models (LLMs). However, many downstream tasks require fine-tuning LLMs on private data. While federated learning offers a promising privacy-preserving approach to LLM fine-tuning, the substantial size of an LLM, combined with high computational and communication demands, makes it hard to apply to downstream tasks. More importantly, private edge servers often possess varying computing and networking resources in real-world scenarios, introducing additional complexities to LLM fine-tuning. To tackle these problems, we design and implement an automated federated pipeline, named FedPipe, that fine-tunes LLMs with minimal training cost and without adding any inference latency. FedPipe first identifies the weights to be fine-tuned based on their contributions to LLM training. It then configures a low-rank adapter for each selected weight, trains the local low-rank adapters on each edge server, and aggregates the local adapters of all edge servers to fine-tune the whole LLM. Finally, it quantizes the LLM's parameters to reduce the memory footprint according to the requirements of edge servers. Extensive experiments demonstrate that FedPipe expedites model training and achieves higher accuracy than state-of-the-art benchmarks.
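The pipeline described above (select important weights, attach per-weight low-rank adapters, train them locally, and aggregate the adapters across edge servers) can be illustrated with a minimal PyTorch-style sketch. This is not the paper's implementation: the names (`LoRALinear`, `importance`, `attach_adapters`, `adapter_state`, `aggregate`) are hypothetical, the Frobenius-norm importance score is a stand-in for FedPipe's contribution-based weight selection, plain FedAvg averaging stands in for its adapter aggregation, and the quantization step is omitted.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable low-rank (B @ A) adapter."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weight stays frozen
        # B is zero-initialized so the adapter starts as a no-op.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

def importance(layer: nn.Linear) -> float:
    """Proxy importance score (Frobenius norm); a stand-in for contribution-based selection."""
    return layer.weight.norm().item()

def attach_adapters(model: nn.Module, top_k: int = 2, rank: int = 4) -> nn.Module:
    """Wrap the top_k highest-scoring nn.Linear modules with LoRA adapters."""
    linears = [(n, m) for n, m in model.named_modules() if n and isinstance(m, nn.Linear)]
    linears.sort(key=lambda nm: importance(nm[1]), reverse=True)
    for name, layer in linears[:top_k]:
        parent = model
        *parents, leaf = name.split(".")
        for part in parents:
            parent = getattr(parent, part)
        setattr(parent, leaf, LoRALinear(layer, rank=rank))
    return model

def adapter_state(model: nn.Module) -> dict:
    """Extract only the LoRA parameters: all an edge server needs to upload."""
    return {k: v.detach().clone() for k, v in model.state_dict().items() if "lora_" in k}

def aggregate(states: list[dict]) -> dict:
    """FedAvg over adapter-only state dicts collected from the edge servers."""
    return {k: torch.stack([s[k] for s in states]).mean(dim=0) for k in states[0]}

# Toy usage: two "edge servers" hold adapters on a tiny two-layer model,
# and the server averages their adapter parameters.
clients = [attach_adapters(nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)))
           for _ in range(2)]
global_adapter = aggregate([adapter_state(c) for c in clients])
```

In this sketch only the adapter tensors travel between the edge servers and the aggregator, which is what keeps the communication cost small relative to exchanging full LLM weights.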
Authors: Zihan Fang, Zheng Lin, Zhe Chen, Xianhao Chen, Yue Gao, Yuguang Fang