
Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization (2007.07481v1)

Published 15 Jul 2020 in cs.LG, cs.DC, and stat.ML

Abstract: In federated optimization, heterogeneity in the clients' local datasets and computation speeds results in large variations in the number of local updates performed by each client in each communication round. Naive weighted aggregation of such models causes objective inconsistency, that is, the global model converges to a stationary point of a mismatched objective function which can be arbitrarily different from the true objective. This paper provides a general framework to analyze the convergence of federated heterogeneous optimization algorithms. It subsumes previously proposed methods such as FedAvg and FedProx and provides the first principled understanding of the solution bias and the convergence slowdown due to objective inconsistency. Using insights from this analysis, we propose FedNova, a normalized averaging method that eliminates objective inconsistency while preserving fast error convergence.

Authors (5)
  1. Jianyu Wang (84 papers)
  2. Qinghua Liu (33 papers)
  3. Hao Liang (137 papers)
  4. Gauri Joshi (73 papers)
  5. H. Vincent Poor (884 papers)
Citations (1,124)

Summary

  • The paper introduces a general convergence framework that reveals how naive weighted averaging in federated learning leads to a mismatched surrogate objective.
  • FedNova is proposed as a novel normalized averaging method that corrects disparate local update contributions, ensuring convergence to the true global objective.
  • The theoretical analysis and experiments on datasets like CIFAR-10 quantify the convergence slowdown and optimality gap caused by client heterogeneity.

Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization

The paper "Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization" by Jianyu Wang et al. addresses a significant challenge in Federated Learning (FL): the objective inconsistency resulting from variability in clients' local update counts. This issue arises due to disparities in local dataset sizes, computation speeds, and other environmental factors, causing naive weighted aggregation methods in FL to converge to a stationary point of a mismatched objective function rather than the true global objective.

Key Contributions

The authors make several notable contributions:

  1. General Convergence Framework: The paper introduces a comprehensive framework to analyze the convergence of heterogeneous federated optimization algorithms, subsuming methods like FedAvg and FedProx. This framework provides insights into solution bias and convergence slowdown due to objective inconsistency.
  2. FedNova Algorithm: FedNova is proposed as a novel normalized averaging method that eradicates objective inconsistency while maintaining rapid error convergence.

Heterogeneity and Its Implications

In federated learning, clients often have varied computational capacities and local data sizes, leading to differences in the number of local updates $\tau_i$ performed per communication round. This heterogeneity can result in significant misalignment between the aggregated global model and the true objective function. The authors point out that most prior convergence analyses assume homogeneous local updates, which is rarely the case in practical FL settings.

Objective Inconsistency

The core issue is that naive averaging of client models after heterogeneous local updates leads to convergence not to the true objective $F(x)$ but to a surrogate objective $\widetilde{F}(x)$, which can deviate arbitrarily from $F(x)$. This is illustrated through a simple quadratic model, demonstrating that FedAvg can converge to a point that is far from the true global minimum depending on the relative local update counts.
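To make the inconsistency concrete, the following minimal simulation (not from the paper) runs FedAvg-style averaging on two clients with quadratic objectives and unequal local step counts; the specific minimizers, step counts, and learning rate are illustrative. The aggregated model settles at the surrogate objective's minimizer, biased toward the client that takes more local steps:

```python
import numpy as np

# Two clients with quadratic objectives F_i(x) = 0.5 * (x - e_i)^2.
# True global objective: F(x) = sum_i p_i * F_i(x), minimized at sum_i p_i * e_i.
e = [1.0, -1.0]           # local minimizers (illustrative)
p = np.array([0.5, 0.5])  # data fractions / aggregation weights
tau = [10, 1]             # heterogeneous local step counts
eta = 0.1                 # local learning rate

true_opt = float(np.dot(p, e))  # = 0.0 for this setup

x = 5.0  # initial global model
for _ in range(200):            # communication rounds
    local_models = []
    for i in range(2):
        xi = x
        for _ in range(tau[i]):        # tau_i local gradient steps on F_i
            xi -= eta * (xi - e[i])    # gradient of 0.5 * (x - e_i)^2
        local_models.append(xi)
    x = float(np.dot(p, local_models))  # naive weighted averaging (FedAvg)

print(f"FedAvg fixed point: {x:.4f}   true optimum: {true_opt:.4f}")
# The gap shows convergence to a mismatched surrogate objective,
# biased toward the client that takes more local steps.
```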

Analysis Framework

The presented theoretical framework models the general update rule for heterogeneous federated optimization as $x^{(t+1,0)} = x^{(t,0)} - \eta \sum_{i=1}^m w_i \, d_i^{(t)}$. Here, $d_i^{(t)}$ is client $i$'s normalized gradient, $w_i$ are the aggregation weights, and $\eta$ is a scaling parameter. This formulation allows for the inclusion of various local solvers (SGD, proximal SGD, momentum-based methods, etc.) and different numbers of local updates.
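A minimal sketch of this update rule in Python, assuming vanilla local SGD so that the normalized direction $d_i$ reduces to the average of client $i$'s local gradients; the function names and plain-NumPy setup are illustrative, not the authors' implementation:

```python
import numpy as np

def client_update(x_global, grad_fn, tau_i, eta):
    """Run tau_i local SGD steps on one client and return the normalized
    direction d_i; for plain SGD this equals the average of the local gradients."""
    x = x_global.copy()
    for _ in range(tau_i):
        x = x - eta * grad_fn(x)
    return (x_global - x) / (eta * tau_i)   # d_i

def server_update(x_global, directions, weights, eta):
    """Generalized aggregation rule: x <- x - eta * sum_i w_i * d_i.

    Different choices of `weights` (and of the normalization inside d_i)
    recover different algorithms; naive FedAvg effectively uses weights
    proportional to p_i * tau_i, which is the source of the inconsistency.
    """
    combined = sum(w * d for w, d in zip(weights, directions))
    return x_global - eta * combined
```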

Convergence Analysis

The authors derive an upper bound on the optimization error and decompose it as $\min_{t \in [T]} \mathbb{E}\,\|\nabla F(x^{(t,0)})\|^2 \leq \mathcal{O}\!\left(\frac{\sigma^2}{\sqrt{m \tau T}}\right) + \mathcal{O}\!\left(\frac{A}{\sqrt{m \tau T}}\right) + \mathcal{O}\!\left(\frac{mB}{T}\right) + \mathcal{O}\!\left(\frac{mC}{T}\right)$, where $\sigma^2$ bounds the variance of the local stochastic gradients and $A$, $B$, $C$ are constants determined by the clients' local-update patterns and aggregation weights. This analysis quantifies the slowdown due to heterogeneity and provides sharper insights into the optimality gap.

FedNova

FedNova corrects the aggregated model update by normalizing each client's local update before averaging, ensuring the global model converges to a stationary point of the true objective function. This is achieved by normalizing each client's cumulative update by its number of local steps, setting the aggregation weights to $w_i = p_i$ (client $i$'s fraction of the total data), and rescaling the averaged direction by an effective number of local steps. The theoretical analysis and numerical experiments on synthetic and real-world datasets (e.g., CIFAR-10) validate FedNova's superior accuracy and convergence speed compared to traditional methods like FedAvg and FedProx.
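The aggregation can be sketched as below, specialized to vanilla local SGD; for momentum or proximal local solvers the paper normalizes by a solver-specific accumulation factor rather than the raw step count $\tau_i$, and the function name and argument layout here are illustrative:

```python
def fednova_round(x_global, client_deltas, client_steps, data_fracs, eta):
    """One FedNova aggregation round, sketched for vanilla local SGD.

    client_deltas[i] : x_global - x_i  (client i's accumulated local movement)
    client_steps[i]  : tau_i, number of local SGD steps client i performed
    data_fracs[i]    : p_i, client i's share of the total training data
    eta              : local learning rate
    """
    # Normalized directions: divide each client's movement by its own step
    # count, so clients that ran longer do not dominate the average.
    directions = [delta / (eta * tau)
                  for delta, tau in zip(client_deltas, client_steps)]

    # Effective number of local steps for this round.
    tau_eff = sum(p * tau for p, tau in zip(data_fracs, client_steps))

    # Aggregate with data-proportional weights w_i = p_i, then rescale so the
    # overall step size is comparable to running tau_eff local steps.
    aggregated = sum(p * d for p, d in zip(data_fracs, directions))
    return x_global - tau_eff * eta * aggregated
```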

Practical and Theoretical Implications

  1. Practical: FedNova significantly mitigates the issue of stragglers and allows fast clients to contribute more effective updates without waiting for slower ones, thereby improving overall training efficiency.
  2. Theoretical: The framework provides a foundation for rigorously analyzing and understanding the convergence behavior of federated optimization algorithms under realistic heterogeneous conditions.

Future Directions

The work opens up multiple avenues for future research, such as extending the framework to adaptive optimization methods and gossip-based training models, potentially leading to further improvements in federated learning paradigms.

In conclusion, this paper provides a robust and insightful understanding of objective inconsistency in federated optimization and introduces FedNova as a compelling solution, laying the groundwork for advancements in heterogeneous federated learning.
