FedFisher: Leveraging Fisher Information for One-Shot Federated Learning (2403.12329v1)

Published 19 Mar 2024 in cs.LG, cs.DC, and stat.ML

Abstract: Standard federated learning (FL) algorithms typically require multiple rounds of communication between the server and the clients, which has several drawbacks, including the need for constant network connectivity, repeated investment of computational resources, and susceptibility to privacy attacks. One-Shot FL is a new paradigm that aims to address this challenge by enabling the server to train a global model in a single round of communication. In this work, we present FedFisher, a novel algorithm for one-shot FL that makes use of Fisher information matrices computed on local client models, motivated by a Bayesian perspective of FL. First, we theoretically analyze FedFisher for two-layer over-parameterized ReLU neural networks and show that the error of our one-shot FedFisher global model becomes vanishingly small as the width of the neural networks and the amount of local training at clients increase. Next, we propose practical variants of FedFisher using the diagonal Fisher and K-FAC approximations of the full Fisher, and highlight their communication and compute efficiency for FL. Finally, we conduct extensive experiments on various datasets, which show that these variants of FedFisher consistently improve over competing baselines.
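
To make the diagonal-Fisher variant concrete, below is a minimal sketch (not the authors' implementation) of one-shot Fisher-weighted model merging: each client k estimates a diagonal empirical Fisher F_k from squared gradients on its own data, and the server combines the client weights w_k in a single round as w* = (sum_k F_k)^{-1} (sum_k F_k w_k), computed element-wise since each F_k is diagonal. The helper names (diag_fisher, fisher_merge), the batch count, and the eps stabilizer are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn.functional as F

def diag_fisher(model, loader, n_batches=10):
    """Estimate a diagonal empirical Fisher for one client: the
    per-parameter average of squared loss gradients over a few batches."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    for i, (x, y) in enumerate(loader):
        if i >= n_batches:
            break
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            fisher[n] += p.grad.detach() ** 2
    return {n: v / n_batches for n, v in fisher.items()}

def fisher_merge(client_models, client_fishers, eps=1e-8):
    """One-shot server-side merge: w* = (sum_k F_k)^{-1} (sum_k F_k w_k),
    done element-wise because the Fisher approximations are diagonal.
    eps guards against parameters with near-zero Fisher mass."""
    param_dicts = [dict(m.named_parameters()) for m in client_models]
    merged = {}
    for name in client_fishers[0]:
        num = sum(f[name] * p[name].detach()
                  for f, p in zip(client_fishers, param_dicts))
        den = sum(f[name] for f in client_fishers) + eps
        merged[name] = num / den
    return merged
```

The merged dictionary can be loaded into a fresh copy of the architecture with model.load_state_dict(merged, strict=False); strict=False is needed because this sketch merges only parameters, not buffers such as BatchNorm running statistics. Setting every F_k to the identity recovers plain one-shot parameter averaging, which is why Fisher weighting can be read as a curvature-aware refinement of FedAvg-style aggregation.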
