Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space (2312.09817v2)
Abstract: Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client's dataset is localized and possibly heterogeneous. In FL, small and noisy datasets are common, highlighting the need for well-calibrated models that represent the uncertainty of predictions. The FL techniques closest to achieving such goals are the Bayesian FL methods, which collect parameter samples from local posteriors and aggregate them to approximate the global posterior. To improve scalability for larger models, one common Bayesian approach is to approximate the global predictive posterior by multiplying local predictive posteriors. In this work, we demonstrate that this method gives systematically overconfident predictions, and we remedy this by proposing $\beta$-Predictive Bayes, a Bayesian FL algorithm that interpolates between a mixture and a product of the predictive posteriors, using a tunable parameter $\beta$. This parameter is tuned to improve the global ensemble's calibration before it is distilled to a single model. Evaluations on a variety of regression and classification datasets show that our method remains better calibrated than other baselines, even as data heterogeneity increases. Code available at https://github.com/hasanmohsin/betaPredBayesFL
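The core aggregation idea in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it assumes a classification setting where each client supplies a predictive probability vector, takes the "product" aggregate to be the normalized geometric mean of the client distributions, and (an assumption on our part) interpolates between the log-mixture and the log-product with the parameter $\beta$. The function name `aggregate_predictions` is hypothetical.

```python
import numpy as np

def aggregate_predictions(client_probs, beta):
    """Interpolate between a mixture and a product of client predictive
    distributions for one test point.

    client_probs: array of shape (K, C) -- K clients, C classes,
                  each row a probability vector.
    beta: interpolation parameter in [0, 1]; 0 recovers the mixture,
          1 the (normalized geometric-mean) product of experts.

    NOTE: this interpolates in log-probability space, which is one
    plausible reading of the abstract, not necessarily the exact rule
    used by beta-Predictive Bayes.
    """
    client_probs = np.asarray(client_probs, dtype=float)
    eps = 1e-12  # numerical floor to avoid log(0)

    # Product-of-experts aggregate: geometric mean of client distributions.
    log_prod = np.log(client_probs + eps).mean(axis=0)
    # Mixture aggregate: arithmetic mean of client distributions.
    log_mix = np.log(client_probs.mean(axis=0) + eps)

    # Interpolate between the two aggregates and renormalize.
    log_agg = beta * log_prod + (1.0 - beta) * log_mix
    p = np.exp(log_agg - log_agg.max())  # subtract max for stability
    return p / p.sum()
```

With $\beta = 0$ this returns the mixture (which tends to be underconfident), and with $\beta = 1$ the normalized product (which the paper argues is systematically overconfident); tuning $\beta$ on held-out data trades off between the two before distillation.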