Flexible Clustered Federated Learning for Client-Level Data Distribution Shift
The paper "Flexible Clustered Federated Learning for Client-Level Data Distribution Shift" introduces FlexCFL, a framework designed to enhance Federated Learning (FL) in environments characterized by non-IID data distributions across client devices. The main innovation of FlexCFL lies in its ability to group and manage client training dynamically, allowing the system to better accommodate statistical heterogeneity and shifts in data distribution at the client level.
Key Contributions
- Clustered Grouping of Clients: FlexCFL proposes an efficient client grouping mechanism based on the Euclidean distance of decomposed cosine similarity (EDC). This approach improves upon traditional cosine similarity methods, particularly in handling high-dimensional data efficiently. The grouping allows for the separation of clients with diverse optimization directions, minimizing the divergence between local updates and the global model aggregated at the server.
- Newcomer Device Integration: The framework includes a scalable approach to integrate new client devices, using a cold start mechanism that assigns newcomers to the most appropriate group based on a comparison of their initial optimization trajectory with pre-established group profiles.
- Client Migration for Data Distribution Shifts: FlexCFL incorporates a mechanism to allow clients to migrate between groups if a significant data distribution shift is detected. The migration process leverages the Wasserstein distance to quantify distribution changes, thus maintaining or improving the model's accuracy when data characteristics at the client level change over time.
- Inter-Group Aggregation: While initially each group operates independently, FlexCFL offers an option for controlled inter-group aggregation. This allows for some level of parameter sharing across groups to enhance learning, especially useful in scenarios where group models experience similar distribution shifts.
Experimental Results
FlexCFL demonstrates significant improvements over traditional FL algorithms like FedAvg and FedProx, as well as over other clustered FL frameworks like IFCA and FeSEM. On diverse datasets such as MNIST, FEMNIST, and FashionMNIST:
- FlexCFL achieved up to a 40.9% improvement in test accuracy on datasets with high heterogeneity compared to traditional methods.
- Under dynamic conditions with data distribution shifts, FlexCFL maintained or improved model performance, while frameworks lacking a migration strategy suffered substantial accuracy degradation.
- The inclusion of inter-group aggregation enabled FlexCFL to adapt better under certain distribution shifts, achieving higher accuracy with reasonable additional communication costs.
Implications and Future Directions
The implications of FlexCFL are substantial for applications requiring robust model performance under non-stationary data distributions, such as mobile and IoT devices where user data can vary significantly over time. The framework's flexibility in client management and dynamic model training suggests avenues for more sophisticated adaptive FL systems.
Future research could delve into optimizing inter-group communication, enhancing the scalability of client migrations, or exploring alternative measures of similarity and divergence for client clustering. Moreover, extending this framework's application to more complex models and domains could further validate and enhance its utility in diverse federated learning scenarios.
In conclusion, FlexCFL represents a significant advance in clustered federated learning by addressing the dual challenges of statistical heterogeneity and data distribution shift in a cohesive and practical manner.