
Think Locally, Act Globally: Federated Learning with Local and Global Representations (2001.01523v3)

Published 6 Jan 2020 in cs.LG, cs.DC, and stat.ML

Abstract: Federated learning is a method of training models on private data distributed over multiple devices. To keep device data private, the global model is trained by only communicating parameters and updates which poses scalability challenges for large models. To this end, we propose a new federated learning algorithm that jointly learns compact local representations on each device and a global model across all devices. As a result, the global model can be smaller since it only operates on local representations, reducing the number of communicated parameters. Theoretically, we provide a generalization analysis which shows that a combination of local and global models reduces both variance in the data as well as variance across device distributions. Empirically, we demonstrate that local models enable communication-efficient training while retaining performance. We also evaluate on the task of personalized mood prediction from real-world mobile data where privacy is key. Finally, local models handle heterogeneous data from new devices, and learn fair representations that obfuscate protected attributes such as race, age, and gender.

Authors (8)
  1. Paul Pu Liang (103 papers)
  2. Terrance Liu (14 papers)
  3. Liu Ziyin (38 papers)
  4. Nicholas B. Allen (2 papers)
  5. Randy P. Auerbach (2 papers)
  6. David Brent (3 papers)
  7. Ruslan Salakhutdinov (248 papers)
  8. Louis-Philippe Morency (123 papers)
Citations (490)

Summary

Overview of Federated Learning with Local and Global Representations

The paper "Think Locally, Act Globally: Federated Learning with Local and Global Representations" presents a novel federated learning algorithm termed LG-FedAvg, designed to address scalability challenges posed by large models in federated learning. The algorithm synergizes compact local representations unique to each device with a global model applicable across all devices, optimizing both training efficiency and communication costs.

Key Contributions

The authors propose a framework where local models independently extract lower-dimensional representations from device data. This strategy ensures that the global model deals only with these representations, effectively reducing the size of the model and the cost of parameter communication. The method is underpinned by a theoretical generalization analysis demonstrating that this dual-model approach mitigates variance both within data and across different device distributions.
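
As a rough illustration of this split, the sketch below keeps a private encoder on each device and exposes only a small shared head to the server. It is written in PyTorch with placeholder layer sizes and class names that are not taken from the paper.

```python
import torch.nn as nn

class LocalEncoder(nn.Module):
    """Private feature extractor; its parameters never leave the device."""
    def __init__(self, in_dim=784, rep_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, rep_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class GlobalHead(nn.Module):
    """Shared classifier; only these (few) parameters are communicated."""
    def __init__(self, rep_dim=64, num_classes=10):
        super().__init__()
        self.fc = nn.Linear(rep_dim, num_classes)

    def forward(self, h):
        return self.fc(h)

# On a device, the prediction is global_head(local_encoder(x)); the server
# only ever receives and averages the small head, not raw data or encoders.
```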

Theoretical and Empirical Insights

  1. Theoretical Generalization Analysis: The authors show that the dual-model structure of LG-FedAvg reduces both the variance within each device's data and the variance across device distributions, a reduction that purely local or purely global models do not achieve on their own.
  2. Communication Efficiency: Empirical results indicate that the proposed method significantly reduces the number of communicated parameters compared to standard federated learning baselines. For instance, experiments on MNIST and CIFAR-10 show that LG-FedAvg matches or exceeds the performance of baseline methods while communicating roughly half the parameters (see the first sketch following this list).
  3. Adaptation to Heterogeneous Data: The method is particularly effective in scenarios with non-i.i.d. data distributions across devices. The separation and optimization of local models enable LG-FedAvg to handle new data with different distributions, maintaining performance without incurring catastrophic forgetting common in global-only models.
  4. Fairness in Model Training: The paper extends the approach with adversarial training of the local models to obfuscate protected attributes such as race, age, and gender (see the second sketch following this list). This capability is important for privacy-preserving applications and highlights the framework's versatility and potential for more ethical representation learning.
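
A minimal sketch of one such communication round appears below. It assumes a `clients` container whose elements expose a private `encoder` and a `dataloader` (illustrative names, not the paper's API) and averages the head parameters without the usual weighting by client data size, purely for brevity.

```python
import copy
import torch
import torch.nn.functional as F

def lg_fedavg_round(server_head, clients, local_steps=1, lr=0.1):
    """One round: each client trains its private encoder plus a copy of the
    shared head on local data, then only the head weights are averaged."""
    head_states = []
    for client in clients:
        head = copy.deepcopy(server_head)  # download the current global head
        opt = torch.optim.SGD(
            list(client.encoder.parameters()) + list(head.parameters()), lr=lr)
        for _ in range(local_steps):
            for x, y in client.dataloader:
                loss = F.cross_entropy(head(client.encoder(x)), y)
                opt.zero_grad()
                loss.backward()
                opt.step()
        head_states.append(head.state_dict())  # upload only the small head

    # Unweighted average of the global head across clients.
    avg = {k: torch.stack([s[k] for s in head_states]).mean(dim=0)
           for k in head_states[0]}
    server_head.load_state_dict(avg)
    return server_head
```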

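For the fairness extension, the adversarial idea can be sketched as follows: an auxiliary classifier tries to recover the protected attribute from the local representation, and the encoder is additionally penalized whenever that recovery succeeds. This is a generic adversarial-representation sketch, not the paper's exact formulation; all module and optimizer names are placeholders.

```python
import torch.nn.functional as F

def adversarial_fairness_step(encoder, head, adversary, enc_opt, adv_opt,
                              x, y, protected, lam=1.0):
    """One illustrative update: `adversary` predicts the protected attribute
    from the local representation; the encoder trades task accuracy against
    making that prediction hard, with `lam` controlling the trade-off."""
    rep = encoder(x)

    # 1) Update the adversary on the current, detached representation.
    adv_opt.zero_grad()
    adv_loss = F.cross_entropy(adversary(rep.detach()), protected)
    adv_loss.backward()
    adv_opt.step()

    # 2) Update encoder + head for the task while maximizing the adversary's
    #    loss, pushing the representation to obfuscate the protected attribute.
    enc_opt.zero_grad()
    task_loss = F.cross_entropy(head(rep), y)
    obfuscation_loss = -F.cross_entropy(adversary(rep), protected)
    (task_loss + lam * obfuscation_loss).backward()
    enc_opt.step()
```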
Potential Implications and Future Directions

The introduction of LG-FedAvg opens several research avenues, particularly in enhancing scalability and efficiency in federated learning. Future work could include integration with model compression techniques to further optimize communication overhead. Moreover, dynamic allocation of neural network layers between local and global components could allow for adaptive model tuning based on specific device or data requirements. The paper also sets a foundation for exploring privacy-preserving techniques in federated learning, especially through enhanced adversarial training methods.

In conclusion, the LG-FedAvg algorithm demonstrates a compelling approach to federated learning, addressing critical issues of scalability and communication efficiency while providing robust adaptability to heterogeneous data and fostering fairness in representation learning. The insights from this work suggest further exploration in the deployment of federated learning systems in real-world applications such as healthcare and data-sensitive environments.