Fed-CO2: Cooperation of Online and Offline Models for Severe Data Heterogeneity in Federated Learning (2312.13923v2)
Abstract: Federated Learning (FL) has emerged as a promising distributed learning paradigm that enables multiple clients to learn a global model collaboratively without sharing their private data. However, the effectiveness of FL is highly dependent on the quality of the training data. In particular, data heterogeneity issues, such as label distribution skew and feature skew, can significantly impact the performance of FL. Previous studies in FL have primarily focused on addressing label distribution skew, while only a few recent works have made initial progress in tackling feature skew. Notably, these two forms of data heterogeneity have been studied separately and have not been well explored within a unified FL framework. To address this gap, we propose Fed-CO$_2$, a universal FL framework that handles both label distribution skew and feature skew within a \textbf{C}ooperation mechanism between the \textbf{O}nline and \textbf{O}ffline models. Specifically, the online model learns general knowledge that is shared among all clients, while the offline model is trained locally to learn the specialized knowledge of each individual client. To further enhance model cooperation in the presence of feature shifts, we design an intra-client knowledge transfer mechanism that reinforces mutual learning between the online and offline models, and an inter-client knowledge transfer mechanism to increase the models' domain generalization ability. Extensive experiments show that Fed-CO$_2$ outperforms a wide range of existing personalized federated learning algorithms in handling label distribution skew and feature skew, both individually and collectively. The empirical results are supported by our convergence analyses in a simplified setting.
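To make the intra-client cooperation concrete, the sketch below illustrates one common way to implement mutual learning between an online (shared) and an offline (personalized) model: each model is trained with its own cross-entropy loss plus a KL term that pulls it toward the other model's prediction. This is an illustrative assumption about the loss shape, not the paper's exact formulation; the function names, the weighting factor `lam`, and the use of plain NumPy are all hypothetical choices made for a self-contained example.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    """Per-sample KL(p || q) between two probability vectors."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def mutual_distillation_losses(logits_online, logits_offline, labels, lam=0.5):
    """Cross-entropy on each model's own logits, plus a KL term
    that lets each model learn from the other's (fixed) prediction.
    `lam` trades off supervised fitting vs. mutual knowledge transfer."""
    p_on, p_off = softmax(logits_online), softmax(logits_offline)
    n = len(labels)
    ce_on = -np.log(p_on[np.arange(n), labels] + 1e-12).mean()
    ce_off = -np.log(p_off[np.arange(n), labels] + 1e-12).mean()
    loss_online = ce_on + lam * kl_div(p_off, p_on).mean()    # offline teaches online
    loss_offline = ce_off + lam * kl_div(p_on, p_off).mean()  # online teaches offline
    return loss_online, loss_offline
```

In a full FL round, each client would minimize `loss_online` with respect to the online model (whose weights are later aggregated at the server) and `loss_offline` with respect to the local offline model, which never leaves the client. When the two models agree exactly, both KL terms vanish and each loss reduces to its own cross-entropy.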
- Zhongyi Cai
- Ye Shi
- Wei Huang
- Jingya Wang