
Cello: Efficient Computer Systems Optimization with Predictive Early Termination and Censored Regression (2204.04831v1)

Published 11 Apr 2022 in cs.LG, cs.AR, cs.DC, and cs.PF

Abstract: Sample-efficient machine learning (SEML) has been widely applied to find optimal latency and power tradeoffs for configurable computer systems. Instead of randomly sampling from the configuration space, SEML reduces the search cost by dramatically reducing the number of configurations that must be sampled to optimize system goals (e.g., low latency or energy). Nevertheless, SEML only reduces one component of cost -- the total number of samples collected -- but does not decrease the cost of collecting each sample. Critically, not all samples are equal; some take much longer to collect because they correspond to slow system configurations. This paper presents Cello, a computer systems optimization framework that reduces sample collection costs -- especially those that come from the slowest configurations. The key insight is to predict ahead of time whether samples will have poor system behavior (e.g., long latency or high energy) and terminate these samples early, before their measured system behavior surpasses the termination threshold; we call this predictive early termination. To predict the future system behavior accurately before it manifests as high runtime or energy, Cello uses censored regression to produce accurate predictions for running samples. We evaluate Cello by optimizing latency and energy for Apache Spark workloads. We give Cello a fixed amount of time to search a combined space of hardware and software configuration parameters. Our evaluation shows that compared to the state-of-the-art SEML approach in computer systems optimization, Cello improves latency by 1.19X for minimizing latency under a power constraint, and improves energy by 1.18X for minimizing energy under a latency constraint.
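
The early-termination idea in the abstract can be illustrated with a minimal sketch. This is not Cello's implementation: the names `Sample`, `run_with_early_termination`, and `linear_extrapolation` are hypothetical, the progress checkpoints are made-up inputs, and a simple linear extrapolation stands in for the censored-regression predictor the paper describes. The sketch only shows the control flow: a running sample is stopped as soon as its predicted final latency exceeds the threshold, and the stopped run is recorded as a censored observation (its true latency is known only to be at least the elapsed time).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Sample:
    config: str
    value: float    # final latency if completed, else elapsed time at termination
    censored: bool  # True if the run was stopped early (value is a lower bound)


def linear_extrapolation(elapsed: float, fraction_done: float) -> float:
    """Stand-in predictor (an assumption, not censored regression):
    assume the remaining work proceeds at the same rate as the work so far."""
    if fraction_done <= 0:
        return elapsed  # no progress information yet; use the lower bound
    return elapsed / fraction_done


def run_with_early_termination(
    config: str,
    checkpoints: List[Tuple[float, float]],  # (elapsed_seconds, fraction_done)
    threshold: float,
    predict: Callable[[float, float], float] = linear_extrapolation,
) -> Sample:
    """Walk through a run's progress checkpoints and stop it early when the
    predicted final latency exceeds the termination threshold."""
    last_elapsed = 0.0
    for elapsed, fraction_done in checkpoints:
        last_elapsed = elapsed
        if fraction_done >= 1.0:
            # The run finished: record the exact (uncensored) latency.
            return Sample(config, elapsed, censored=False)
        if predict(elapsed, fraction_done) > threshold:
            # Predicted to be slow: stop now and record a censored observation.
            return Sample(config, elapsed, censored=True)
    # Checkpoints ended before completion: only a lower bound is known.
    return Sample(config, last_elapsed, censored=True)


# Hypothetical traces: one fast configuration and one slow configuration.
fast = run_with_early_termination("fast", [(10.0, 0.5), (20.0, 1.0)], threshold=30.0)
slow = run_with_early_termination(
    "slow", [(10.0, 0.1), (20.0, 0.2), (100.0, 1.0)], threshold=30.0
)
print(fast)
print(slow)
```

In this toy trace, the slow configuration is stopped at its first checkpoint instead of running to completion, which is where the per-sample savings come from. A learner that consumes these records has to treat censored values as lower bounds rather than exact measurements, which is the role the abstract assigns to censored regression.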

