
Finishing Flows Quickly with Preemptive Scheduling (1206.2057v2)

Published 10 Jun 2012 in cs.NI

Abstract: Today's data centers face extreme challenges in providing low latency. However, fair sharing, a principle commonly adopted in current congestion control protocols, is far from optimal for satisfying latency requirements. We propose Preemptive Distributed Quick (PDQ) flow scheduling, a protocol designed to complete flows quickly and meet flow deadlines. PDQ enables flow preemption to approximate a range of scheduling disciplines. For example, PDQ can emulate a shortest job first algorithm to give priority to the short flows by pausing the contending flows. PDQ borrows ideas from centralized scheduling disciplines and implements them in a fully distributed manner, making it scalable to today's data centers. Further, we develop a multipath version of PDQ to exploit path diversity. Through extensive packet-level and flow-level simulation, we demonstrate that PDQ significantly outperforms TCP, RCP and D3 in data center environments. We further show that PDQ is stable, resilient to packet loss, and preserves nearly all its performance gains even given inaccurate flow information.

Citations (533)

Summary

  • The paper introduces PDQ, a preemptive, distributed scheduling protocol that reduces average flow completion times by about 30%.
  • It employs scheduling disciplines like SJF and EDF to dynamically prioritize critical flows, enabling up to three times better deadline adherence than traditional protocols.
  • An extension, M-PDQ, uses multipath routing to balance loads and boost network efficiency in data center environments.

Insightful Overview of "Finishing Flows Quickly with Preemptive Scheduling"

The paper "Finishing Flows Quickly with Preemptive Scheduling" introduces Preemptive Distributed Quick (PDQ) flow scheduling, a protocol that achieves low latency in data center networks by completing flows rapidly and meeting flow deadlines. The work is motivated by data center applications with strict latency requirements, which current congestion control protocols such as TCP and RCP fail to meet effectively because they adhere to fair sharing.

Key Contributions and Findings

The authors propose PDQ, a distributed flow scheduling protocol that emulates scheduling disciplines such as Shortest Job First (SJF) and Earliest Deadline First (EDF) to optimize flow completion times and deadline adherence. PDQ employs preemptive scheduling: critical flows can supersede less critical ones, dynamically reallocating bandwidth toward the chosen performance goal.
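The core idea can be illustrated with a toy allocator. This is only a sketch of the SJF-style preemption PDQ approximates, not the paper's actual distributed switch algorithm; all names and the single-link model are hypothetical simplifications.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    flow_id: int
    remaining_bytes: int  # expected remaining size; SJF's criticality metric

def allocate(flows, link_capacity_bps):
    """Grant the entire link to the single most critical flow and pause
    the rest. Under SJF, 'most critical' means smallest remaining size;
    an EDF variant would rank by earliest deadline instead."""
    if not flows:
        return {}
    winner = min(flows, key=lambda f: f.remaining_bytes)
    return {f.flow_id: (link_capacity_bps if f is winner else 0.0)
            for f in flows}

flows = [Flow(1, 9_000_000), Flow(2, 50_000), Flow(3, 2_000_000)]
rates = allocate(flows, link_capacity_bps=10e9)  # 10 Gbps link
# flow 2, the shortest, receives the full link; flows 1 and 3 are paused
```

Contrast this with fair sharing, which would give each flow a third of the link and delay the short flow behind the long ones, which is precisely the behavior PDQ is designed to avoid.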

Simulation Results

Through detailed simulations, PDQ demonstrates superiority over existing protocols including TCP, RCP, and D3. Noteworthy results include:

  • Flow Completion Times: PDQ reduces average flow completion times by approximately 30% compared to the traditional protocols.
  • Deadline Adherence: It supports up to three times more concurrent flows while satisfying their deadlines, significantly surpassing D3.
  • Stability and Resilience: PDQ remains stable under packet loss and performs well despite inaccuracies in flow information.

Multipath PDQ (M-PDQ)

An extension, M-PDQ, leverages multiple network paths for increased reliability and network utilization. This variant further decreases flow completion times by balancing loads across multiple paths, especially under sparse network loads.
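The load-balancing intuition behind M-PDQ can be sketched as follows. This is an illustrative heuristic only; the function name and the per-path load metric are hypothetical, and the real protocol's mechanism for exploiting path diversity differs in detail.

```python
def pick_path(path_loads):
    """Illustrative sketch: direct a new flow to the path currently
    reporting the least load, so traffic spreads across the available
    path diversity rather than piling onto one congested route."""
    return min(path_loads, key=path_loads.get)

# hypothetical load metric: number of active flows per path
loads = {"path_a": 3, "path_b": 1, "path_c": 2}
chosen = pick_path(loads)  # the least-loaded path is selected
```

Under light load this spreading is especially effective, matching the paper's observation that M-PDQ's gains are largest under sparse network loads.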

Design Innovations

The design of PDQ is remarkable for its distributed approach, overcoming the limitations of centralized scheduling in large-scale data centers. Key features include:

  • Flow Prioritization Using FIFO Queues: It uses simple FIFO tail-drop queues, avoiding complex priority queue implementations.
  • Early Start and Termination: These techniques minimize idle periods and proactively terminate flows that are unlikely to meet deadlines, enhancing network efficiency.
  • Suppressed Probing: This technique optimizes resource usage by reducing the probing rate of paused flows based on their expected waiting time.
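The early-termination feature reduces to a simple feasibility check: if a flow cannot finish before its deadline even at the maximum achievable rate, continuing it only wastes bandwidth that other flows could use. The sketch below is a simplified version of that idea; the function name and units are assumptions, not the paper's exact rule.

```python
def should_terminate(remaining_bytes, max_rate_bps, now, deadline):
    """Early-termination heuristic: terminate the flow if, even when
    granted the full rate, it cannot complete before its deadline."""
    time_needed = remaining_bytes * 8 / max_rate_bps  # seconds at full rate
    return now + time_needed > deadline

# 1 MB remaining at 1 Gbps needs 8 ms; with only 5 ms of slack, terminate
infeasible = should_terminate(1_000_000, 1e9, now=0.0, deadline=0.005)
# with 20 ms of slack the flow is still feasible, so keep it running
feasible = not should_terminate(1_000_000, 1e9, now=0.0, deadline=0.020)
```

Terminating such doomed flows early frees their bandwidth for flows that can still meet their deadlines, which is where the network-efficiency gain comes from.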

Implications and Future Directions

The findings have significant implications both practically and theoretically. Practically, PDQ’s efficient bandwidth allocation mechanics could lead to improved application performance in latency-sensitive environments. Theoretically, the success of PDQ’s distributed, preemptive approach may inspire future work on decentralized scheduling algorithms in data networks.

Looking ahead, this research opens avenues for tighter integration with multipath protocols and for handling inaccuracies in flow size estimation. It also raises questions about fairness and the potential for users to game the system, highlighting areas for further scrutiny and innovation.

In conclusion, PDQ represents a substantial contribution to the field of network protocol design, addressing critical challenges in achieving low-latency and deadline-sensitive operation within modern data centers.