Salvaging Federated Learning by Local Adaptation
The paper "Salvaging Federated Learning by Local Adaptation," by Tao Yu, Eugene Bagdasaryan, and Vitaly Shmatikov, examines how the privacy and robustness mechanisms used in Federated Learning (FL) affect model accuracy for individual participants. FL is designed to train models on non-IID data distributed across many clients, such as text typed by smartphone users or medical records held by different hospitals. While it preserves privacy by keeping data local, mechanisms such as differential privacy and robust aggregation can degrade the utility of the resulting global model for individual clients.
Key Findings
- Inefficacy of FL Models for Some Participants: The authors show that in tasks such as next-word prediction, many participants gain nothing from FL because the federated model is less accurate on their data than a model trained purely locally. The problem worsens when differential privacy or robust aggregation is applied: in a Reddit-based next-word prediction task, the median-based robust aggregation model (ROBUST-FED) was less accurate than participants' own local models for about 52.15% of participants.
- Local Adaptation Techniques: The authors evaluate fine-tuning, multi-task learning (MTL), and knowledge distillation as ways of adapting the federated model to each participant's local data distribution. Local adaptation generally improves per-participant accuracy; notably, participants who initially had little incentive to join FL, because their purely local models were more accurate, see substantial gains, making the adapted federated model advantageous for them.
- Impact of Adaptation: The adaptation strategies recover the accuracy lost to privacy and robustness constraints. In image classification tasks, for example, adapted models recover, and can exceed, the accuracy lost to differential privacy (a 7.83% drop) and robust aggregation (an 11.89% drop).
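The robustness/accuracy tension in the findings above can be made concrete with a small sketch of coordinate-wise median aggregation, one common robust alternative to plain federated averaging (an illustrative toy example, not the paper's exact implementation):

```python
from statistics import median

def median_aggregate(client_updates):
    """Coordinate-wise median of client model updates.

    Unlike plain averaging, the median bounds the influence any single
    (possibly malicious) client has on each parameter -- but it also
    discards information from honest clients, which is the source of
    the accuracy loss discussed above.
    """
    return [median(coords) for coords in zip(*client_updates)]

# Three clients' updates for a 4-parameter model; the third is an outlier
# (e.g. a poisoned update that averaging would absorb).
updates = [
    [0.10, -0.20, 0.05, 0.00],
    [0.12, -0.18, 0.07, 0.02],
    [9.00,  9.00, 9.00, 9.00],
]
print(median_aggregate(updates))  # -> [0.12, -0.18, 0.07, 0.02]
```

A plain mean of these updates would be pulled toward 3.0 in every coordinate by the outlier, while the median stays close to the honest clients' values.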
Implications
Practical Implications: Local adaptation makes FL more attractive to participants by ensuring the model each of them ends up with actually serves their data. Fine-tuning, for example, lets a participant reuse the federated model's learned feature representations while specializing the model to their own data, making FL viable across diverse datasets and applications.
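A minimal sketch of this kind of fine-tuning: treat the federated model's feature extractor as frozen, and train only a small classification head on each participant's local data. The features, labels, and training setup below are hypothetical, chosen only to illustrate the idea, not the paper's experimental configuration:

```python
import math

def finetune_head(features, labels, w, lr=0.5, epochs=200):
    """Fine-tune only the classification head on local data.

    `features` are the fixed representations the (frozen) federated
    feature extractor would produce for each local example; only the
    head weights `w` are updated, a cheap form of local adaptation.
    Uses plain SGD on the logistic (log-loss) objective.
    """
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid prediction
            grad = p - y                      # dLoss/dz for log-loss
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
    return w

# Hypothetical frozen features for four local examples (last entry = bias).
feats  = [[1.0, 0.2, 1.0], [0.9, 0.1, 1.0], [0.1, 0.9, 1.0], [0.2, 1.0, 1.0]]
labels = [1, 1, 0, 0]
w = finetune_head(feats, labels, w=[0.0, 0.0, 0.0])
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0 for x in feats]
print(preds)  # -> [1, 1, 0, 0]
```

Because only the head is trained, this adaptation costs a tiny fraction of full retraining, which is part of why fine-tuning is attractive on resource-limited participants.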
Theoretical Implications: From a theoretical standpoint, this paper underscores an essential trade-off in FL between maintaining participant privacy and achieving optimal individual accuracy. The research points towards a balanced approach where model aggregation strategies incorporate participant-specific adaptations without structural changes to FL frameworks, which are typically controlled by overarching platform operators.
Future Directions
While local adaptation techniques clearly improve individual model performance, the paper opens several avenues for future research: optimizing the trade-off between adaptation quality and computational cost, exploring semi-universal models that further harmonize global accuracy with individual performance, and designing robust aggregation strategies beyond simple median and mean combinations.
In conclusion, the work by Yu, Bagdasaryan, and Shmatikov provides crucial insights into the enhancement of Federated Learning through participant-specific model adaptations, highlighting a pathway to reconcile privacy, robustness, and accuracy in distributed machine learning environments.