Fed-ensemble: Improving Generalization through Model Ensembling in Federated Learning (2107.10663v1)
Abstract: In this paper we propose Fed-ensemble: a simple approach that brings model ensembling to federated learning (FL). Instead of aggregating local models to update a single global model, Fed-ensemble uses random permutations to update a group of K models and then obtains predictions through model averaging. Fed-ensemble can be readily utilized within established FL methods and does not impose a computational overhead, as it only requires one of the K models to be sent to a client in each communication round. Theoretically, we show that predictions on new data from all K models belong to the same predictive posterior distribution under a neural tangent kernel regime. This result in turn sheds light on the generalization advantages of model averaging. We also illustrate that Fed-ensemble has an elegant Bayesian interpretation. Empirical results show that our model has superior performance over several FL algorithms, on a wide range of data sets, and excels in heterogeneous settings often encountered in FL applications.
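The abstract describes the core mechanism: K models maintained at the server, a random assignment of clients to models in each communication round (so each client trains only one model), and inference by averaging the K models' predictions. Below is a minimal sketch of that loop, assuming generic `local_train` and `average_weights` callables; these names are illustrative placeholders rather than the authors' API, and the per-group aggregation is assumed to be a FedAvg-style weight average.

```python
# Minimal sketch of a Fed-ensemble-style training loop (illustrative, not the authors' code).
import random
import numpy as np


def fed_ensemble(models, clients, rounds, local_train, average_weights):
    """Train K models by assigning each client exactly one model per round.

    models          : list of K weight objects (e.g., np.ndarray per model)
    clients         : list of client datasets
    local_train     : fn(weights, client_data) -> updated weights
    average_weights : fn(list_of_weights) -> averaged weights (FedAvg-style, assumed)
    """
    K = len(models)
    for _ in range(rounds):
        # Random permutation of clients, split into K groups;
        # each client receives and trains only one of the K models.
        order = random.sample(range(len(clients)), len(clients))
        groups = [order[k::K] for k in range(K)]
        for k, group in enumerate(groups):
            updates = [local_train(models[k], clients[c]) for c in group]
            if updates:
                models[k] = average_weights(updates)
    return models


def ensemble_predict(models, forward, x):
    """Prediction on new data = average of the K models' outputs (model averaging)."""
    return np.mean([forward(w, x) for w in models], axis=0)
```

Because each client handles a single model per round, the per-round client cost matches single-model FL; only the server-side storage grows with K.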