Trustworthy Personalized Bayesian Federated Learning via Posterior Fine-Tune (2402.16911v1)
Abstract: Performance degradation caused by data heterogeneity and poor output interpretability are the most significant challenges facing federated learning in practical applications. Personalized federated learning departs from the traditional approach: rather than training a single shared model, it tailors a unique model to each client. However, previous work has addressed personalization only at the level of neural network parameters and lacks robustness and interpretability. In this work, we establish a novel framework for personalized federated learning that incorporates Bayesian methodology, enhancing the algorithm's ability to quantify uncertainty. Furthermore, we introduce normalizing flows to achieve personalization from the parameter-posterior perspective and theoretically analyze their impact on out-of-distribution (OOD) detection for Bayesian neural networks. Finally, we evaluate our approach on heterogeneous datasets; the experimental results indicate that the new algorithm not only improves accuracy but also significantly outperforms the baselines in OOD detection, thanks to the reliable outputs of the Bayesian approach.
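The abstract's central mechanism, fine-tuning a shared parameter posterior into a personalized one with a small normalizing flow, can be sketched in a few lines. Below is a minimal PyTorch illustration under assumed details: the toy linear classifier, the RealNVP-style coupling layers, the stand-in prior, and all names (`PosteriorFlow`, `AffineCoupling`, `predictive_entropy`) are illustrative choices of ours, not the authors' code, and the paper's exact objective and flow architecture may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D_IN, D_OUT = 7, 3                 # toy classifier: 7 features -> 3 classes
D = (D_IN + 1) * D_OUT             # flattened weight+bias dimension (even)

class AffineCoupling(nn.Module):
    """RealNVP-style coupling: rescales/shifts the first half of a flat
    parameter vector conditioned on the second half."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(dim - self.half, hidden), nn.Tanh(),
            nn.Linear(hidden, 2 * self.half))

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x2).chunk(2, dim=-1)
        s = torch.tanh(s)                          # keep log-scales bounded
        y1 = x1 * torch.exp(s) + t
        return torch.cat([y1, x2], dim=-1), s.sum(dim=-1)  # y, log|det J|

class PosteriorFlow(nn.Module):
    """Pushes samples from a shared Gaussian posterior q0 through a small
    flow, yielding a personalized posterior q_k on client k."""
    def __init__(self, dim, n_layers=2):
        super().__init__()
        self.layers = nn.ModuleList([AffineCoupling(dim) for _ in range(n_layers)])

    def forward(self, z):
        logdet = torch.zeros(z.size(0))
        for layer in self.layers:
            z, ld = layer(z)
            logdet = logdet + ld
            z = z.flip(-1)                         # cheap fixed "permutation"
        return z, logdet

# Shared global posterior q0 = N(mu, diag(sigma^2)); in the paper this would
# come from federated Bayesian training, here it is fixed for the sketch.
mu, log_sigma = torch.zeros(D), torch.full((D,), -2.0)

def logits_from(theta, x):
    W = theta[:, :D_IN * D_OUT].view(-1, D_OUT, D_IN)
    b = theta[:, D_IN * D_OUT:]
    return torch.einsum('soi,ni->sno', W, x) + b[:, None, :]

def elbo(flow, x, y, n_samples=8):
    theta0 = mu + log_sigma.exp() * torch.randn(n_samples, D)   # theta0 ~ q0
    log_q0 = torch.distributions.Normal(mu, log_sigma.exp()).log_prob(theta0).sum(-1)
    theta, logdet = flow(theta0)                   # theta ~ q_k
    log_qk = log_q0 - logdet                       # change of variables
    ll = -F.cross_entropy(logits_from(theta, x).reshape(-1, D_OUT),
                          y.repeat(n_samples), reduction='none')
    ll = ll.view(n_samples, -1).sum(-1)            # local log-likelihood
    log_p = torch.distributions.Normal(0., 1.).log_prob(theta).sum(-1)  # stand-in prior
    return (ll + log_p - log_qk).mean()

# Local fine-tuning: only the flow is trained; q0 stays fixed.
flow = PosteriorFlow(D)
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
x_local, y_local = torch.randn(64, D_IN), torch.randint(0, D_OUT, (64,))
for _ in range(200):
    opt.zero_grad()
    (-elbo(flow, x_local, y_local)).backward()
    opt.step()

@torch.no_grad()
def predictive_entropy(flow, x, n_samples=32):
    """Entropy of the Bayesian model average; higher suggests OOD input."""
    theta, _ = flow(mu + log_sigma.exp() * torch.randn(n_samples, D))
    p = torch.softmax(logits_from(theta, x), dim=-1).mean(0)
    return -(p * p.clamp_min(1e-12).log()).sum(-1)
```

Note the division of labor: only the flow parameters are optimized while the shared Gaussian q0 is frozen, which reflects the "posterior fine-tune" idea of personalizing the posterior rather than the raw network weights. Predictive entropy over the Bayesian model average is one standard OOD score; the paper's evaluation may use a different criterion.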