- The paper introduces FedMix, a novel algorithm that approximates Mixup under Mean Augmented Federated Learning (MAFL) to tackle non-iid data in federated learning.
- Clients share only averaged local data, preserving privacy while still enabling effective Mixup-style data augmentation.
- Empirical evaluations on standard benchmark datasets show that FedMix significantly outperforms traditional federated learning methods, especially under high data heterogeneity.
"FedMix: Approximation of Mixup under Mean Augmented Federated Learning" addresses the challenges posed by non-iid (non-independent and identically distributed) data in federated learning (FL). Federated learning enables multiple edge devices to collaboratively train machine learning models without sharing their local data, preserving data privacy and eliminating the need for centralized data storage. However, one of the critical issues in FL is that performance can significantly degrade when local data across clients is heterogeneous.
The authors propose a framework called Mean Augmented Federated Learning (MAFL) to mitigate this issue. The key idea behind MAFL is that clients share and receive averaged local data, rather than raw samples, while still adhering to privacy constraints. The intuition is that these averages give each client an approximate view of the data distributions on other clients, reducing the negative impact of data heterogeneity.
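A minimal sketch of the client-side averaging step, assuming array-valued inputs and one-hot labels; the helper name and the `mash_size` parameter (how many raw samples are averaged into each shared example) are illustrative choices, not the paper's API:

```python
import numpy as np

def make_averaged_batch(x_local, y_local, mash_size=16):
    """Average disjoint groups of `mash_size` local examples into
    'mean-augmented' data that can be shared under MAFL without
    exposing any individual raw sample."""
    n_groups = len(x_local) // mash_size
    xs, ys = [], []
    for g in range(n_groups):
        sl = slice(g * mash_size, (g + 1) * mash_size)
        xs.append(x_local[sl].mean(axis=0))  # averaged input
        ys.append(y_local[sl].mean(axis=0))  # averaged (soft) label
    return np.stack(xs), np.stack(ys)
```

Intuitively, a larger `mash_size` blurs individual samples more aggressively, trading augmentation fidelity for stronger privacy.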
Building upon the MAFL framework, the paper introduces FedMix, its data augmentation algorithm. FedMix draws inspiration from Mixup, a well-known and effective augmentation technique that trains on convex combinations of sample pairs (x̃ = λx_i + (1−λ)x_j, ỹ = λy_i + (1−λ)y_j), which would ordinarily require mixing raw data samples from different clients. To ensure that raw data is never directly shared, FedMix instead approximates Mixup using the averaged data exchanged under MAFL. This approach retains the augmentation benefits of Mixup without compromising privacy.
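Below is a sketch of the simplest ("naive") way to realize this mixing on a client, combining each local example with a received averaged example. The function name and the default λ are illustrative; the full FedMix algorithm refines this idea by approximating the resulting Mixup loss with a first-order Taylor expansion rather than materializing mixed inputs directly:

```python
import numpy as np

def naive_mix(x_local, y_local, x_avg, y_avg, lam=0.9):
    """Mix local data with received averaged (MAFL) data:
    x~ = lam * x_i + (1 - lam) * x_bar, and likewise for labels.
    A high `lam` keeps the local sample dominant."""
    # Pair each local example with a randomly chosen averaged example.
    idx = np.random.randint(len(x_avg), size=len(x_local))
    x_mix = lam * x_local + (1.0 - lam) * x_avg[idx]
    y_mix = lam * y_local + (1.0 - lam) * y_avg[idx]
    return x_mix, y_mix
```

The mixed pairs (x_mix, y_mix) are then used as ordinary training batches during local updates.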
The empirical evaluations demonstrate that FedMix significantly outperforms existing federated learning algorithms, particularly in settings where data across clients is highly non-iid. Experiments on standard benchmark datasets show that FedMix achieves better model performance, making it a promising solution for practical federated learning applications where data heterogeneity is prevalent.
In summary, through MAFL and the FedMix algorithm, this paper provides an effective approach to improving federated learning under non-iid conditions, maintaining data privacy, and enhancing overall model performance.