Optimizing the Communication-Accuracy Trade-off in Federated Learning with Rate-Distortion Theory (2201.02664v3)

Published 7 Jan 2022 in cs.LG, cs.DC, cs.IT, math.IT, and stat.ML

Abstract: A significant bottleneck in federated learning (FL) is the network communication cost of sending model updates from client devices to the central server. We present a comprehensive empirical study of the statistics of model updates in FL, as well as the role and benefits of various compression techniques. Motivated by these observations, we propose a novel method to reduce the average communication cost, which is near-optimal in many use cases, and outperforms Top-K, DRIVE, 3LC and QSGD on Stack Overflow next-word prediction, a realistic and challenging FL benchmark. This is achieved by examining the problem using rate-distortion theory, and proposing distortion as a reliable proxy for model accuracy. Distortion can be more effectively used for optimizing the trade-off between model performance and communication cost across clients. We demonstrate empirically that in spite of the non-i.i.d. nature of federated learning, the rate-distortion frontier is consistent across datasets, optimizers, clients and training rounds.
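The abstract frames update compression as a rate-distortion problem, with distortion serving as a proxy for the eventual drop in model accuracy. The sketch below is a minimal illustration of that framing, not the paper's method: it measures the rate (bits per parameter) and distortion (mean squared error) of two standard baselines mentioned in the abstract's comparisons, Top-K sparsification and uniform scalar quantization, on a synthetic client update. All function names, parameter choices, and the synthetic update distribution are assumptions made for illustration.

```python
# Illustrative sketch of the rate-distortion trade-off for model-update
# compression. This is NOT the paper's near-optimal scheme; it only shows
# how rate and distortion can be measured for simple baselines.
import numpy as np


def top_k(update, k):
    """Keep the k largest-magnitude entries; zero out the rest."""
    out = np.zeros_like(update)
    idx = np.argsort(np.abs(update))[-k:]
    out[idx] = update[idx]
    # Rough rate estimate: 32 bits per kept value plus index overhead.
    bits = k * (32 + np.ceil(np.log2(update.size)))
    return out, bits


def uniform_quantize(update, num_levels):
    """Uniform scalar quantization between the update's min and max."""
    lo, hi = update.min(), update.max()
    step = (hi - lo) / (num_levels - 1)
    reconstructed = np.round((update - lo) / step) * step + lo
    bits = update.size * np.ceil(np.log2(num_levels))
    return reconstructed, bits


def distortion(original, compressed):
    """Mean squared error, used here as a proxy for the impact on accuracy."""
    return float(np.mean((original - compressed) ** 2))


rng = np.random.default_rng(0)
update = rng.laplace(scale=0.01, size=10_000)  # synthetic client model update

for k in (100, 1_000, 5_000):
    rec, bits = top_k(update, k)
    print(f"top-{k:>5}: rate={bits / update.size:5.2f} bits/param, "
          f"distortion={distortion(update, rec):.2e}")

for levels in (4, 16, 256):
    rec, bits = uniform_quantize(update, levels)
    print(f"quant-{levels:>3}: rate={bits / update.size:5.2f} bits/param, "
          f"distortion={distortion(update, rec):.2e}")
```

Sweeping the compression parameters (k or the number of quantization levels) traces out a rate-distortion curve for each scheme; the abstract's claim is that such curves are consistent across clients, datasets, and rounds, which is what makes distortion a usable optimization target.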

Authors (4)
  1. Nicole Mitchell (7 papers)
  2. Johannes Ballé (29 papers)
  3. Zachary Charles (33 papers)
  4. Jakub Konečný (28 papers)
Citations (19)
