Federated Generative Learning with Foundation Models (2306.16064v2)

Published 28 Jun 2023 in cs.LG and cs.AI

Abstract: Existing approaches in Federated Learning (FL) mainly focus on sending model parameters or gradients from clients to a server. However, these methods suffer from significant inefficiency as well as privacy and security concerns. Leveraging emerging generative foundation models, we propose a novel federated learning framework, Federated Generative Learning. In this framework, each client creates text embeddings tailored to its local data and sends these embeddings to the server. Informative training data can then be synthesized remotely on the server by conditioning foundation generative models on the embeddings, benefiting downstream FL tasks. Our framework offers several advantages: increased communication efficiency, robustness to data heterogeneity, substantial performance improvements, and enhanced privacy protection. We validate these benefits through extensive experiments conducted on 12 datasets. For example, on ImageNet100 with a highly skewed data distribution, our method trained for a single communication round outperforms FedAvg trained for 200 communication rounds by 12%. We have released the code for all experiments conducted in this study.
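The round described in the abstract — clients send compact text embeddings instead of parameters or gradients, and the server synthesizes training data from them — can be sketched as follows. This is a minimal illustration, not the paper's implementation: `toy_embed`, `Client`, and `Server` are hypothetical stand-ins, and a real system would use an actual text encoder (e.g. CLIP) and a text-to-image diffusion model in place of the toy functions.

```python
# Toy sketch of one Federated Generative Learning round.
# Assumption: toy_embed stands in for a text encoder, and
# Server.synthesize stands in for conditioning a generative
# foundation model on the received embeddings.
import hashlib


def toy_embed(prompt: str) -> list[float]:
    """Stand-in text encoder: maps a class-level prompt to a fixed-size vector."""
    digest = hashlib.sha256(prompt.encode()).digest()
    return [b / 255.0 for b in digest[:8]]


class Client:
    def __init__(self, local_class_names: list[str]):
        self.local_class_names = local_class_names

    def make_embeddings(self) -> list[list[float]]:
        # Instead of model parameters or gradients, the client uploads
        # prompt embeddings tailored to its local data distribution.
        return [toy_embed(f"a photo of a {c}") for c in self.local_class_names]


class Server:
    def synthesize(self, embeddings, per_embedding: int = 4):
        # Stand-in for remote data synthesis: each embedding conditions
        # the generative model to produce several synthetic samples.
        return [(e, i) for e in embeddings for i in range(per_embedding)]


clients = [Client(["goldfish", "tench"]), Client(["hammerhead"])]
server = Server()
all_embeddings = [e for c in clients for e in c.make_embeddings()]
synthetic_data = server.synthesize(all_embeddings)
print(len(all_embeddings), len(synthetic_data))  # 3 embeddings -> 12 samples
```

Note how little crosses the network in this sketch: three short vectors rather than full model updates, which is the source of the communication-efficiency and privacy claims above.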
