Device Scheduling with Fast Convergence for Wireless Federated Learning (1911.00856v1)

Published 3 Nov 2019 in cs.NI, cs.IT, cs.LG, and math.IT

Abstract: Owing to the increasing need for massive data analysis and model training at the network edge, as well as rising concerns about data privacy, a new distributed training framework called federated learning (FL) has emerged. In each iteration of FL (called a round), the edge devices update local models based on their own data and contribute to the global training by uploading the model updates via wireless channels. Due to the limited spectrum resources, only a portion of the devices can be scheduled in each round. While most of the existing work on scheduling focuses on the convergence of FL with respect to rounds, the convergence performance under a total training time budget has not yet been explored. In this paper, a joint bandwidth allocation and scheduling problem is formulated to capture the long-term convergence performance of FL, and is solved by decoupling it into two sub-problems. For the bandwidth allocation sub-problem, the derived optimal solution suggests allocating more bandwidth to the devices with worse channel conditions or weaker computation capabilities. For the device scheduling sub-problem, by revealing the trade-off between the number of rounds required to attain a certain model accuracy and the latency per round, a greedy policy is proposed that repeatedly selects the device consuming the least time in model updating, until a good trade-off between learning efficiency and per-round latency is achieved. Experiments show that the proposed policy outperforms other state-of-the-art scheduling policies, achieving the best model accuracy under training time budgets.

Authors (3)
  1. Wenqi Shi (21 papers)
  2. Sheng Zhou (186 papers)
  3. Zhisheng Niu (97 papers)
Citations (163)

Summary

Device Scheduling with Fast Convergence for Wireless Federated Learning

The paper "Device Scheduling with Fast Convergence for Wireless Federated Learning" presents a comprehensive analysis of federated learning (FL) in wireless networks, focusing primarily on optimizing device scheduling to enhance convergence rates within constrained training time budgets. Federated learning emerges as a crucial framework that allows distributed model training leveraging local data on edge devices, addressing privacy concerns and excessive transmission costs associated with centralized approaches.

Contributions

The authors address a fundamental limitation in existing FL scheduling approaches, which typically focus on convergence with respect to iterative rounds rather than total training time. They propose a joint optimization model for bandwidth allocation and device scheduling, aimed at maximizing convergence speed within a given time frame. This model is pivotal for real-world FL applications in dynamic wireless environments where latency and bandwidth constraints are prevalent.
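A schematic way to read this joint problem is as minimizing the total time to reach a target accuracy, which is roughly the number of rounds required times the latency of each round. The notation below (scheduled set S, bandwidth fractions gamma_k, per-device update time t_k, rounds-to-accuracy N(|S|)) is illustrative rather than the paper's exact symbols.

```latex
% Schematic joint bandwidth-allocation and scheduling problem
% (illustrative notation, not the paper's exact formulation):
%   S       : set of devices scheduled per round
%   gamma_k : fraction of the total bandwidth allocated to device k
%   t_k     : time device k needs to compute and upload its update
%   N(|S|)  : rounds required to reach a target accuracy with |S| devices/round
\begin{equation}
  \min_{\mathcal{S},\,\{\gamma_k\}}\;
     N(|\mathcal{S}|)\cdot \max_{k\in\mathcal{S}} t_k(\gamma_k)
  \quad \text{s.t.} \quad
     \sum_{k\in\mathcal{S}} \gamma_k \le 1,\;\; \gamma_k \ge 0 .
\end{equation}
```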

Key Findings

  1. Bandwidth Allocation Strategy: The paper derives an optimal bandwidth allocation that gives more bandwidth to devices with poorer channel conditions or weaker computational capabilities, which reduces the per-round latency otherwise dominated by the slowest scheduled device.
  2. Device Scheduling Algorithm: A greedy scheduling policy is developed that iteratively selects the devices consuming the least time per update, balancing learning efficiency against per-round latency (see the sketch after this list). This approach achieves higher model accuracy than state-of-the-art scheduling methods across the simulated scenarios.
  3. Effectiveness with Non-IID Data: Through empirical evaluations on the MNIST dataset under both IID and non-IID conditions, the paper shows that scheduling more devices per round notably improves convergence for highly non-IID data. A regression analysis quantifies the number of rounds required to reach specific model accuracies under varying data distributions.
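Below is a minimal Python sketch (not the authors' implementation) of how Findings 1 and 2 fit together: bandwidth shares are set by bisecting on a common per-round deadline, which naturally gives devices with worse channels or slower compute a larger share, and a greedy loop then grows the scheduled set while the estimated total training time keeps improving. The `Device` fields, the `rounds_needed` function, and all numbers are hypothetical placeholders.

```python
from dataclasses import dataclass
import math

@dataclass
class Device:
    t_comp: float     # local computation time per round [s]
    data_bits: float  # size of the model update to upload [bits]
    spec_eff: float   # uplink spectral efficiency [bits/s/Hz]

def min_round_latency(scheduled, total_bw):
    """Smallest per-round latency for a scheduled set when bandwidth shares are
    chosen so every device finishes by the same deadline; slower devices (worse
    channel or weaker compute) automatically receive larger shares."""
    lo = max(d.t_comp for d in scheduled) + 1e-9
    hi = lo + sum(d.data_bits / (total_bw * d.spec_eff) for d in scheduled) + 1.0

    def bw_fraction_needed(tau):
        # Total bandwidth fraction required for every device to finish by tau.
        return sum(d.data_bits / (d.spec_eff * (tau - d.t_comp)) / total_bw
                   for d in scheduled)

    for _ in range(60):  # bisection on the common deadline tau
        mid = (lo + hi) / 2
        if bw_fraction_needed(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

def greedy_schedule(devices, rounds_needed, total_bw):
    """Greedy policy sketch: add devices in order of their standalone update
    time and keep the set size that minimizes the estimated total training
    time, rounds_needed(|S|) * per-round latency."""
    order = sorted(devices,
                   key=lambda d: d.t_comp + d.data_bits / (total_bw * d.spec_eff))
    selected, best_set, best_time = [], [], math.inf
    for d in order:
        selected.append(d)
        total = rounds_needed(len(selected)) * min_round_latency(selected, total_bw)
        if total < best_time:
            best_set, best_time = list(selected), total
    return best_set, best_time

# Toy usage with hypothetical numbers; rounds_needed mimics the paper's
# regression-based estimate of rounds-to-accuracy vs. devices per round.
devs = [Device(0.4, 8e5, 2.0), Device(1.0, 8e5, 0.8), Device(0.2, 8e5, 3.5)]
best_set, est_time = greedy_schedule(devs, rounds_needed=lambda n: 200 / n,
                                     total_bw=1e6)
```

Equalizing completion times is only one natural way to realize "more bandwidth to worse devices"; the paper derives its own optimal allocation, for which this equal-deadline rule is merely a stand-in.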

Implications

The findings have direct implications for the deployment and optimization of FL systems in mobile and wireless networks. By optimizing device scheduling and bandwidth allocation, edge computing can more effectively harness distributed data for AI applications without jeopardizing user privacy. The proposed methods offer substantial potential for enhancing resource allocation in vehicular networks, mobile apps, and other edge-computing platforms.

Future Directions

The paper points to future research directions, including heterogeneous device configurations and mobility patterns, further integration of analog aggregation techniques, and adaptation to time-varying wireless channel conditions. These advances could make FL implementations more robust in fluctuating network environments.

Overall, the paper effectively demonstrates how intelligent scheduling and resource allocation can significantly impact the efficacy of federated learning systems, paving the way for more efficient and realistic AI model training in wireless networks.