When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions (2306.15546v2)
Abstract: The intersection of Foundation Models (FMs) and Federated Learning (FL) offers mutual benefits, presents a unique opportunity to unlock new possibilities in AI research, and addresses critical challenges in AI and real-world applications. FL expands the availability of data for FMs and enables computation sharing, distributing the training process and reducing the burden on FL participants. It promotes collaborative FM development, democratizing the process and fostering inclusivity and innovation. On the other hand, FMs, with their enormous size, pre-trained knowledge, and exceptional performance, serve as robust starting points for FL, facilitating faster convergence and better performance under non-IID data. Additionally, leveraging FMs to generate synthetic data enriches data diversity, reduces overfitting, and preserves privacy. By examining the interplay between FL and FMs, this paper aims to deepen the understanding of their synergistic relationship, highlighting the motivations, challenges, and future directions. Through an exploration of the challenges faced by FL and FMs individually and at their intersection, we aim to inspire future research that can further enhance both fields, driving advances in privacy-preserving and scalable AI systems.
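The FM-as-initialization synergy the abstract describes can be made concrete with a short sketch. Below is a minimal, hypothetical PyTorch example of a FedAvg loop whose global model starts from pre-trained weights rather than random initialization; `TinyBackbone`, the random client tensors, and all hyperparameters are illustrative stand-ins (a real setup would load an actual foundation-model checkpoint and real local datasets), not the paper's own implementation.

```python
import copy
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Toy stand-in for a pre-trained foundation-model backbone.
    In practice, weights would be loaded from a checkpoint (e.g., a ViT or BERT encoder)."""
    def __init__(self, dim=16, num_classes=2):
        super().__init__()
        self.encoder = nn.Linear(dim, dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        return self.head(torch.relu(self.encoder(x)))

def local_update(model, data, targets, epochs=1, lr=1e-2):
    """One client's local training pass (standard FedAvg client step)."""
    model = copy.deepcopy(model)  # train a private copy; raw data never leaves the client
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(data), targets).backward()
        opt.step()
    return model.state_dict()

def fedavg(global_model, client_data, rounds=5):
    """Server loop: broadcast the global model, train locally, average parameters."""
    for _ in range(rounds):
        states = [local_update(global_model, x, y) for x, y in client_data]
        avg = {k: torch.stack([s[k] for s in states]).mean(0) for k in states[0]}
        global_model.load_state_dict(avg)
    return global_model

# Initializing the global model from pre-trained weights (instead of from
# scratch) is the mechanism the paper credits with faster convergence and
# better performance under non-IID client data.
pretrained = TinyBackbone()  # assume weights restored from a foundation model
clients = [(torch.randn(8, 16), torch.randint(0, 2, (8,))) for _ in range(3)]
fedavg(pretrained, clients)
```

The same loop could also host the abstract's second synergy: an FM on the server side might generate synthetic examples that are mixed into each client's local data, reducing statistical heterogeneity across clients.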
Authors: Weiming Zhuang, Chen Chen, Lingjuan Lyu