
Weld: Rethinking the Interface Between Data-Intensive Applications (1709.06416v2)

Published 14 Sep 2017 in cs.DC, cs.DB, and cs.PF

Abstract: Data analytics applications combine multiple functions from different libraries and frameworks. Even when each function is optimized in isolation, the performance of the combined application can be an order of magnitude below hardware limits due to extensive data movement across these functions. To address this problem, we propose Weld, a new interface between data-intensive libraries that can optimize across disjoint libraries and functions. Weld exposes a lazily-evaluated API where diverse functions can submit their computations in a simple but general intermediate representation that captures their data-parallel structure. It then optimizes data movement across these functions and emits efficient code for diverse hardware. Weld can be integrated into existing frameworks such as Spark, TensorFlow, Pandas and NumPy without changing their user-facing APIs. We demonstrate that Weld can speed up applications using these frameworks by up to 29x.
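To make the lazy-evaluation idea in the abstract concrete, here is a minimal sketch in Python. It is not Weld's API or its intermediate representation: `LazyVector`, `lib_a_scale`, `lib_b_shift`, and `eager_pipeline` are hypothetical names invented for illustration. It only shows the general pattern the abstract describes, where independent functions record their element-wise work instead of running it, so the combined computation can execute in a single pass without materializing an intermediate result.

```python
from typing import Callable, List, Tuple


class LazyVector:
    """Hypothetical lazy vector: records element-wise ops instead of running them."""

    def __init__(self, data: List[float],
                 ops: Tuple[Callable[[float], float], ...] = ()):
        self._data = data
        self._ops = ops  # pending element-wise functions, in application order

    def map(self, fn: Callable[[float], float]) -> "LazyVector":
        # No work happens here; we only extend the recorded computation.
        return LazyVector(self._data, self._ops + (fn,))

    def evaluate(self) -> List[float]:
        # Fused execution: one pass over the data, no intermediate lists.
        out = []
        for x in self._data:
            for fn in self._ops:
                x = fn(x)
            out.append(x)
        return out


# Two stand-in "libraries" that each contribute one step
# without knowing about the other (both are assumptions for this sketch).
def lib_a_scale(v: LazyVector, factor: float) -> LazyVector:
    return v.map(lambda x: x * factor)


def lib_b_shift(v: LazyVector, offset: float) -> LazyVector:
    return v.map(lambda x: x + offset)


def eager_pipeline(data: List[float], factor: float, offset: float) -> List[float]:
    # Baseline for comparison: each step makes a full pass over the data.
    scaled = [x * factor for x in data]   # intermediate result is materialized
    return [x + offset for x in scaled]   # second full pass


result = lib_b_shift(lib_a_scale(LazyVector([1.0, 2.0, 3.0]), 2.0), 1.0).evaluate()
print(result)
```

Both pipelines compute the same values; the difference is that the lazy version touches each element once. A real system along the lines the abstract describes would also compile the fused loop to efficient code for the target hardware, which this sketch does not attempt.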

Citations (21)
