Stochastic Controlled Averaging for Federated Learning with Communication Compression (2308.08165v2)

Published 16 Aug 2023 in math.OC, cs.DC, cs.LG, and stat.ML

Abstract: Communication compression, a technique that reduces the volume of information transmitted over the air, has gained great interest in Federated Learning (FL) for its potential to alleviate communication overhead. However, communication compression brings new challenges to FL due to the interplay between compression-incurred information distortion and inherent characteristics of FL such as partial participation and data heterogeneity. Despite recent developments, the potential of compressed FL approaches has not been fully exploited. Existing approaches either cannot accommodate arbitrary data heterogeneity or partial participation, or require stringent conditions on compression. In this paper, we revisit the seminal stochastic controlled averaging method by proposing an equivalent but more efficient and simplified formulation with halved uplink communication costs. Building upon this implementation, we propose two compressed FL algorithms, SCALLION and SCAFCOM, to support unbiased and biased compression, respectively. Both proposed methods outperform existing compressed FL methods in terms of communication and computation complexities. Moreover, SCALLION and SCAFCOM accommodate arbitrary data heterogeneity and do not make any additional assumptions on compression errors. Experiments show that SCALLION and SCAFCOM match the performance of corresponding full-precision FL approaches with substantially reduced uplink communication and outperform recent compressed FL methods under the same communication budget.


Summary

Stochastic Controlled Averaging for Federated Learning with Communication Compression

Federated Learning (FL) has emerged as a powerful paradigm for training machine learning models across decentralized data sources such as mobile devices and remote sensors. The approach promotes data privacy by having local devices transmit model updates, rather than raw data, to a central server. Despite its advantages, FL faces challenges from communication overhead, data heterogeneity, and partial client participation. Communication compression, aimed at reducing this overhead, introduces additional complexity through compression-induced information distortion.

The paper, "Stochastic Controlled Averaging for Federated Learning with Communication Compression," revisits a seminal approach in FL by presenting a more communication-efficient variant. It proposes two algorithms designed to accommodate unbiased and biased compression methods.

Key Contributions

  • Simplified Controlled Averaging: The authors introduce a refined formulation of stochastic controlled averaging that halves uplink communication. Instead of sending both the local model update and the control-variable update, each client transmits only one compressed vector per round (see the sketch after this list).
  • Robust to Data Heterogeneity and Partial Participation: Both proposed algorithms accommodate arbitrary data heterogeneity without additional assumptions and retain strong empirical performance even when only a small fraction of clients participate in each round.
  • Superior Convergence Rates: The algorithms, namely SCALLION and SCAFCOM, are supported by rigorous theoretical analyses showing state-of-the-art convergence rates comparable to full-precision counterparts.
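The single-upload idea can be illustrated with a short sketch. This is not the paper's exact SCALLION/SCAFCOM recursion; it is a simplified, SCAFFOLD/DIANA-style round in which every client uploads one compressed vector, and the server derives both its model step and its control-variate update from that vector. The names (fl_round, random_k, lr_l, lr_g, alpha) and the specific update rules are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_k(v, k):
    """Unbiased random-k sparsification: keep k random coordinates, rescale by d/k."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = (d / k) * v[idx]
    return out

def fl_round(x, c, client_controls, grad_fns, lr_l=0.05, lr_g=1.0, steps=5, k=10):
    """One round: each client uploads a single compressed vector; the server uses it
    both to update the global model and to track the average local progress."""
    alpha = k / x.size                            # damping = 1/(omega + 1) for random-k
    uploads = []
    for c_i, grad in zip(client_controls, grad_fns):
        y = x.copy()
        for _ in range(steps):                    # control-variate-corrected local SGD
            y = y - lr_l * (grad(y) + c - c_i)
        d_i = (x - y) / (steps * lr_l)            # average local progress this round
        m_i = random_k(d_i - c_i, k)              # the ONE compressed uplink message
        c_i += alpha * m_i                        # in-place update of the local control variate
        uploads.append(m_i)
    avg = np.mean(uploads, axis=0)
    x = x - lr_g * steps * lr_l * (c + avg)       # unbiased estimate of the average progress
    c = c + alpha * avg                           # server control variate stays the mean of c_i
    return x, c

# Toy usage: quadratics f_i(w) = 0.5 * ||w - t_i||^2 with heterogeneous targets t_i.
d, n = 100, 8
targets = [rng.standard_normal(d) for _ in range(n)]
grads = [lambda w, t=t: w - t for t in targets]
x, c = np.zeros(d), np.zeros(d)
controls = [np.zeros(d) for _ in range(n)]
for _ in range(300):
    x, c = fl_round(x, c, controls, grads)
print(np.linalg.norm(x - np.mean(targets, axis=0)))  # should shrink toward zero
```

Per client per round, only the compressed vector m_i travels uplink, which is the halving of communication relative to sending both a model update and a control-variable update.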

Numerical Results and Implications

Experiments on widely used datasets (MNIST, Fashion-MNIST) with both biased and unbiased compressors underscore the effectiveness of SCALLION and SCAFCOM. The proposed methods achieve results close to full-precision FL with significantly reduced communication costs, reaching up to 100x uplink compression in certain setups.
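As background on the two compressor families mentioned above, the snippet below shows standard examples: an unbiased random-k sparsifier (the family SCALLION targets) and a biased top-k sparsifier (the family SCAFCOM targets). These are common choices in the compressed-FL literature; the paper's exact compressors and compression ratios may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(v, k):
    """Unbiased random-k sparsification: E[rand_k(v)] = v thanks to the d/k rescaling."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = (d / k) * v[idx]
    return out

def top_k(v, k):
    """Biased top-k sparsification: keep the k largest-magnitude entries, no rescaling."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

v = rng.standard_normal(1000)
# Transmitting 10 of 1000 coordinates is roughly a 100x reduction in the number of
# values sent per upload (index overhead ignored).
print(np.count_nonzero(rand_k(v, 10)), np.count_nonzero(top_k(v, 10)))
```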

  1. Experimental Validation: With proper tuning, SCALLION and SCAFCOM match or outperform existing compressed FL methods under the same communication budget. Their robustness to client drift and compression distortion positions them as practical solutions for real-world deployments.
  2. Compression Efficiency: By reducing uplink communication without sacrificing accuracy, these techniques offer a practical path to scaling FL to larger client populations and more resource-constrained networks.

Future Prospects

This research invites further exploration into adaptive compression schemes, privacy-preserving protocols, and hybrid models that might integrate FL with other distributed learning frameworks. Given the ever-increasing demand for privacy and resource-efficient machine learning solutions, these contributions present promising avenues for future advancements in AI systems.

In conclusion, the paper advances federated learning by mitigating critical bottlenecks in communication and client variability, laying the groundwork for more efficient and robust machine learning systems distributed across diverse and decentralized environments.
