Joint Device Scheduling and Resource Allocation for Latency Constrained Wireless Federated Learning (2007.07174v1)

Published 14 Jul 2020 in cs.IT, cs.LG, cs.NI, eess.SP, and math.IT

Abstract: In federated learning (FL), devices contribute to the global training by uploading their local model updates via wireless channels. Due to limited computation and communication resources, device scheduling is crucial to the convergence rate of FL. In this paper, we propose a joint device scheduling and resource allocation policy to maximize the model accuracy within a given total training time budget for latency constrained wireless FL. A lower bound on the reciprocal of the training performance loss, in terms of the number of training rounds and the number of scheduled devices per round, is derived. Based on the bound, the accuracy maximization problem is solved by decoupling it into two sub-problems. First, given the scheduled devices, the optimal bandwidth allocation suggests allocating more bandwidth to the devices with worse channel conditions or weaker computation capabilities. Then, a greedy device scheduling algorithm is introduced, which in each step selects the device consuming the least updating time obtained by the optimal bandwidth allocation, until the lower bound begins to increase, meaning that scheduling more devices will degrade the model accuracy. Experiments show that the proposed policy outperforms state-of-the-art scheduling policies under extensive settings of data distributions and cell radius.

Authors (5)
  1. Wenqi Shi (21 papers)
  2. Sheng Zhou (186 papers)
  3. Zhisheng Niu (97 papers)
  4. Miao Jiang (12 papers)
  5. Lu Geng (4 papers)
Citations (271)

Summary

Joint Device Scheduling and Resource Allocation for Latency Constrained Wireless Federated Learning

The paper under review presents a detailed examination of optimizing device scheduling and resource allocation within the context of wireless Federated Learning (FL). Federated Learning is an emerging distributed machine learning framework in which end devices, rather than a centralized server, participate in training a global model by sharing model updates derived from their local data. This approach is particularly well suited to addressing the privacy concerns, data transfer costs, and bandwidth limitations inherent in centralized training.

Summary of Contributions

The authors propose a joint device scheduling and resource allocation strategy tailored to maximize model accuracy within a constrained total training time. The focus is on determining the optimal set of scheduled devices and the bandwidth allocated to each, such that fast convergence is achieved without breaching latency constraints. The work is theoretically grounded, developing a framework to predict and optimize the convergence rate of FL with respect to training time.

The methodology involves decoupling the optimization into two distinct sub-problems: bandwidth allocation and device scheduling. The authors introduce a binary-search algorithm to solve the bandwidth allocation problem and a greedy scheduling policy for selecting the devices that participate in each FL round, based on their real-time computation capabilities and channel conditions. This approach allows efficient real-time adaptation to the dynamic wireless environments typical of practical FL deployments.
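The two sub-problems can be sketched as follows. This is an illustrative toy model, not the paper's exact formulation: the per-device computation times and upload terms below are hypothetical, and the stopping criterion based on the convergence bound is replaced by a comment. The binary search finds the smallest common round deadline at which the bandwidths needed by all scheduled devices fit within the total budget (i.e., latencies are equalized), and the greedy loop repeatedly adds the device that keeps round latency lowest.

```python
import math

# Hypothetical per-device parameters: computation time (s) and an upload
# term u_i such that upload time = u_i / b_i for allocated bandwidth b_i.
# These stand in for the channel/CPU models in the paper, not its exact ones.
devices = {
    "dev0": {"comp": 0.8, "upload": 1.2},
    "dev1": {"comp": 0.5, "upload": 2.0},
    "dev2": {"comp": 1.1, "upload": 0.9},
}
TOTAL_BW = 3.0  # total bandwidth budget (normalized units)

def bandwidth_needed(dev, deadline):
    """Bandwidth a device needs to finish computing and uploading by `deadline`."""
    slack = deadline - dev["comp"]
    return math.inf if slack <= 0 else dev["upload"] / slack

def min_round_latency(subset, total_bw=TOTAL_BW, tol=1e-6):
    """Binary-search the smallest common deadline t such that the bandwidths
    needed to meet t sum to at most the budget (latencies are equalized)."""
    lo = max(devices[d]["comp"] for d in subset)          # infeasible: zero slack
    hi = lo + sum(devices[d]["upload"] for d in subset) / total_bw + 1.0  # feasible
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(bandwidth_needed(devices[d], mid) for d in subset) <= total_bw:
            hi = mid
        else:
            lo = mid
    return hi

# Greedy scheduling: repeatedly add the device that keeps round latency lowest.
scheduled, remaining = [], set(devices)
while remaining:
    best = min(remaining, key=lambda d: min_round_latency(scheduled + [d]))
    scheduled.append(best)
    remaining.remove(best)
    # In the paper, the loop stops once the derived convergence bound starts
    # to increase; here we schedule all devices just to show the mechanics.

print(scheduled, round(min_round_latency(scheduled), 3))
```

Note how the equal-deadline structure automatically gives more bandwidth to devices with larger upload terms or longer computation times, matching the paper's observation that devices with worse channels or weaker computation should receive more bandwidth.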

Theoretical Contributions and Experimental Results

A significant contribution of this work is the derivation of a convergence bound explicitly accounting for device scheduling nuances. This bound is applied to quantify the trade-off between latency per communication round and the overall number of communication rounds required to achieve given accuracy benchmarks. This trade-off is crucial for optimizing FL since both parameters profoundly impact the algorithm's efficiency and final performance.
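The trade-off can be illustrated with a toy surrogate (emphatically not the paper's actual bound; all constants below are invented): scheduling more devices per round shrinks a per-round error term, but lengthens each round, so fewer rounds fit within the total time budget.

```python
# Toy illustration of the rounds-vs-devices trade-off (NOT the paper's bound).
T = 100.0   # total training time budget (s), assumed
N = 20      # devices available, assumed

def round_latency(m):
    # Assumed: per-round latency grows with the number of scheduled
    # devices m as the shared bandwidth is split among them.
    return 1.0 + 0.5 * m

def loss_proxy(m):
    # Assumed surrogate: error decays like 1/K over K rounds, plus a
    # sampling penalty that shrinks as more devices report per round.
    K = int(T / round_latency(m))
    return 1.0 / K + (N - m) / (N * m)

best_m = min(range(1, N + 1), key=loss_proxy)
print(best_m, round(loss_proxy(best_m), 4))
```

Under this surrogate the optimum lands strictly between 1 and N: scheduling too few devices inflates the sampling penalty, while scheduling all of them leaves too few rounds in the budget. That interior optimum is exactly what the paper's bound-driven stopping rule searches for.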

The experiments involve extensive trials on widely used datasets such as MNIST and CIFAR-10, showing that the proposed scheduling policy outperforms existing state-of-the-art methods. Notably, the detailed analysis demonstrates the algorithm's adaptability to non-i.i.d. data distributions and different cell radii, attesting to its real-world applicability. The optimal number of devices scheduled per round was empirically found to increase with the degree of non-i.i.d.-ness of the local datasets, demonstrating the efficacy of the proposed framework over simple heuristic baselines.
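A common way to construct such non-i.i.d. splits is label-shard partitioning: sort samples by label, cut the sorted index list into shards, and deal a few shards to each device. The sketch below is a generic illustration of this technique (the paper's exact partitioning scheme may differ), using synthetic stand-in labels rather than real MNIST data.

```python
import random

# Label-shard partitioning sketch for non-i.i.d. federated splits.
# Labels here are synthetic stand-ins for a real dataset such as MNIST.
random.seed(0)
labels = [i % 10 for i in range(1000)]                 # 10 classes, 100 each
indices = sorted(range(len(labels)), key=lambda i: labels[i])

num_devices, shards_per_device = 10, 2
shard_size = len(indices) // (num_devices * shards_per_device)
shards = [indices[k * shard_size:(k + 1) * shard_size]
          for k in range(num_devices * shards_per_device)]
random.shuffle(shards)

# Each device receives `shards_per_device` shards of label-sorted indices.
device_data = {d: [i for s in range(shards_per_device)
                   for i in shards[d * shards_per_device + s]]
               for d in range(num_devices)}

# With few shards per device, each device sees only a handful of classes;
# raising shards_per_device moves the split back toward i.i.d.
classes_per_device = {d: len({labels[i] for i in idxs})
                      for d, idxs in device_data.items()}
print(classes_per_device)
```

Tuning `shards_per_device` gives a single knob for the degree of non-i.i.d.-ness, which is how scheduling policies are typically stress-tested across data-distribution settings.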

Implications and Future Directions

The paper's insights have practical significance in that they provide a feasible way to implement FL over wireless networks, potentially paving the way for more widespread FL adoption. This could, for example, enable adaptive mobile applications and intelligent sensing in IoT devices that require immediate processing without offloading sensitive data to the cloud.

As future work, the authors propose further investigating the optimization of batch sizes and the number of local updates in heterogeneous FL environments. Such research aims to fine-tune the balance between local computation and communication overhead, further reducing total system latency and resource consumption.

This work offers a fundamental step forward in understanding the intricacies of decentralized, privacy-preserving model training over wireless channels, with promising extensions toward more diverse and larger-scale distributed environments.