
Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing (2010.07268v1)

Published 14 Oct 2020 in cs.DC

Abstract: Serverless computing is increasingly being used for parallel computing, which has traditionally been implemented as stateful applications. Executing complex, burst-parallel, directed acyclic graph (DAG) jobs poses a major challenge for serverless execution frameworks, which will need to rapidly scale and schedule tasks at high throughput, while minimizing data movement across tasks. We demonstrate that, for serverless parallel computations, decentralized scheduling enables scheduling to be distributed across Lambda executors that can schedule tasks in parallel, and brings multiple benefits, including enhanced data locality, reduced network I/Os, automatic resource elasticity, and improved cost effectiveness. We describe the implementation and deployment of our new serverless parallel framework, called Wukong, on AWS Lambda. We show that Wukong achieves near-ideal scalability, executes parallel computation jobs up to 68.17x faster, reduces network I/O by multiple orders of magnitude, and achieves 92.96% tenant-side cost savings compared to numpywren.

Citations (106)

Summary

  • The paper presents a novel serverless framework using decentralized scheduling to enhance task execution and data locality for complex DAG jobs.
  • It reports near-ideal scalability, executing jobs up to 68.17× faster than numpywren and reducing network I/O by multiple orders of magnitude.
  • Optimizations such as task clustering and delayed I/O contribute to tenant-side cost savings of up to 92.96% compared to numpywren and to better overall resource efficiency.

A Review of "Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing"

"Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing" presents a novel approach to addressing the limitations of current serverless execution frameworks when handling complex, burst-parallel, directed acyclic graph (DAG) jobs. The paper introduces Wukong, a serverless parallel computing framework that leverages decentralized scheduling to enhance data locality, reduce network I/Os, and achieve automatic resource elasticity.

The authors identify several challenges inherent to executing DAG jobs in serverless environments, including the lack of efficient task scheduling and the excessive data movement caused by poor data locality. Unlike traditional serverful frameworks, which rely on a centralized scheduler, Wukong employs decentralized scheduling: scheduling work is distributed across many Lambda executors, each of which can schedule tasks on its own. Because executors schedule in parallel and tasks run close to the data they consume, this approach reduces both resource contention and network I/O overhead.
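The idea of executors scheduling their own downstream work can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not Wukong's implementation: the function `run_decentralized`, the dependency counters, and the rule "continue with the first ready child locally, launch a new executor for each other ready child" are hypothetical choices made for illustration, and the `results` dictionary stands in for whatever shared storage a real deployment would use.

```python
from collections import defaultdict, deque

def run_decentralized(dag, funcs):
    """Simulate executors that schedule their own downstream tasks.

    dag maps each task to the list of tasks it depends on;
    funcs maps each task to a callable taking its inputs in that order.
    """
    children = defaultdict(list)
    remaining = {}  # number of unfinished dependencies per task
    for task, deps in dag.items():
        remaining[task] = len(deps)
        for dep in deps:
            children[dep].append(task)

    results = {}      # stand-in for shared storage (assumption)
    invocations = []  # (executor_id, task) pairs, in execution order
    pending = deque() # executors waiting to start: (executor_id, first_task)
    next_id = 0

    # One initial executor per task that has no dependencies.
    for task, deps in dag.items():
        if not deps:
            pending.append((next_id, task))
            next_id += 1

    while pending:
        executor_id, task = pending.popleft()
        while task is not None:
            inputs = [results[dep] for dep in dag[task]]
            results[task] = funcs[task](*inputs)
            invocations.append((executor_id, task))

            # The executor that satisfies a child's last dependency
            # becomes responsible for scheduling that child.
            ready = []
            for child in children[task]:
                remaining[child] -= 1
                if remaining[child] == 0:
                    ready.append(child)

            # Keep one ready child local; hand the rest to new executors.
            task = ready[0] if ready else None
            for other in ready[1:]:
                pending.append((next_id, other))
                next_id += 1

    return results, invocations

# A diamond-shaped DAG: a feeds b and c, which both feed d.
dag = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
funcs = {
    "a": lambda: 1,
    "b": lambda x: x + 1,
    "c": lambda x: x * 10,
    "d": lambda x, y: x + y,
}
results, invocations = run_decentralized(dag, funcs)
executors_used = len({executor_id for executor_id, _ in invocations})
```

In this toy run no central component decides what runs next: each executor discovers ready tasks itself, and the diamond completes with two executors rather than one per task.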

The paper supports these claims with extensive evaluations on AWS Lambda. The results indicate that Wukong achieves near-ideal scalability, executing parallel jobs up to 68.17× faster and reducing network I/O by multiple orders of magnitude compared to numpywren, a state-of-the-art system for linear algebra in serverless settings. Wukong also delivers substantial cost savings, with tenant-side cost reductions of up to 92.96% relative to numpywren.
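To put the cost figure on the same footing as the speedup, a percentage saving can be converted into a "times cheaper" ratio. The short calculation below is plain arithmetic on the reported best-case number; the helper name `cost_ratio` is hypothetical, and the result says nothing about typical workloads.

```python
def cost_ratio(savings_fraction):
    """Convert a fractional cost saving into a baseline-to-new cost ratio."""
    return 1.0 / (1.0 - savings_fraction)

# A 92.96% saving means paying 7.04% of the baseline cost.
best_case_ratio = cost_ratio(0.9296)
print(f"{best_case_ratio:.1f}x cheaper in the best reported case")
```

So the best reported saving corresponds to paying roughly one fourteenth of the numpywren cost for the same job.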

A key strength of the Wukong framework is its combination of optimization techniques, namely decentralized scheduling, task clustering, and delayed I/O, to improve data locality and resource efficiency. These techniques let Wukong run a task as soon as its dependencies are satisfied while minimizing unnecessary data transfer between tasks. Wukong also adjusts the level of parallelism dynamically by partitioning and clustering tasks according to their data dependencies, which keeps serverless resources efficiently used.
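One simple way to see why clustering tasks by data dependency reduces data movement is to group linear chains of tasks so that each chain runs in one place, then count the dependency edges that still cross cluster boundaries. The sketch below is an illustrative assumption, not the clustering policy the paper describes: `cluster_chains` and `remote_edges` are hypothetical helpers, and the rule used (merge a task with its parent only when it is the sole child of its sole parent) is just one easy-to-check choice.

```python
def cluster_chains(dag):
    """Group linear chains of tasks into clusters.

    dag maps each task to the list of tasks it depends on.
    Returns a list of clusters, each a list of tasks in execution order.
    """
    children = {task: [] for task in dag}
    for task, deps in dag.items():
        for dep in deps:
            children[dep].append(task)

    def starts_cluster(task):
        deps = dag[task]
        # A task joins its parent's cluster only if it is the sole child
        # of its sole parent; otherwise it starts a new cluster.
        return not (len(deps) == 1 and len(children[deps[0]]) == 1)

    clusters = []
    for task in dag:
        if not starts_cluster(task):
            continue
        chain = [task]
        while len(children[chain[-1]]) == 1:
            nxt = children[chain[-1]][0]
            if len(dag[nxt]) != 1:
                break  # nxt has other inputs, so it starts its own cluster
            chain.append(nxt)
        clusters.append(chain)
    return clusters

def remote_edges(dag, clusters):
    """Count dependency edges whose endpoints sit in different clusters."""
    owner = {task: i for i, cluster in enumerate(clusters) for task in cluster}
    return sum(1 for task, deps in dag.items()
               for dep in deps if owner[dep] != owner[task])

# Two independent pipelines that meet at a join step.
dag = {
    "load": [], "clean": ["load"], "scale": ["clean"],
    "other": [], "norm": ["other"],
    "join": ["scale", "norm"], "report": ["join"],
}
clusters = cluster_chains(dag)
total_edges = sum(len(deps) for deps in dag.values())
crossing = remote_edges(dag, clusters)
```

In this example the seven tasks collapse into three clusters, and only the two edges into `join` cross a cluster boundary; the other four of the six edges connect tasks that could exchange data locally.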

The paper's contributions have significant practical implications for serverless computing. With a framework like Wukong, developers can expect more efficient execution of complex, data-intensive applications, which opens up cost-effective and scalable solutions in domains such as data analytics and real-time machine learning. On the theoretical side, the research lays groundwork for further study of the performance, scalability, and cost-effectiveness of serverless execution models.

In summary, the paper addresses the limitations of typical serverless frameworks by introducing Wukong, which improves task scheduling and substantially reduces data movement. As serverless computing continues to gain traction, frameworks like Wukong will be instrumental in unlocking the full potential of Function as a Service (FaaS) for complex parallel computations. Future research could build on these findings by exploring more advanced optimization strategies and extending Wukong's applicability to a broader range of serverless platforms.