Papers

Topics

Authors

Recent

View all

Assistant

AI Research Assistant

Well-researched responses based on relevant abstracts and paper content.

Custom Instructions Pro

Preferences or requirements that you'd like Emergent Mind to consider when generating responses.

Gemini 2.5 Flash

Gemini 2.5 Flash 87 tok/s

Gemini 2.5 Pro 51 tok/s Pro

GPT-5 Medium 17 tok/s Pro

GPT-5 High 23 tok/s Pro

GPT-4o 102 tok/s Pro

Kimi K2 166 tok/s Pro

GPT OSS 120B 436 tok/s Pro

Claude Sonnet 4 37 tok/s Pro

2000 character limit reached

Efficient Serverless Cold Start: Reducing Library Loading Overhead by Profile-guided Optimization (2504.19283v1)

Published 27 Apr 2025 in cs.DC and cs.PF

Abstract: Serverless computing abstracts away server management, enabling automatic scaling, efficient resource utilization, and cost-effective pricing models. However, despite these advantages, it faces the significant challenge of cold-start latency, adversely impacting end-to-end performance. Our study shows that many serverless functions initialize libraries that are rarely or never used under typical workloads, thus introducing unnecessary overhead. Although existing static analysis techniques can identify unreachable libraries, they fail to address workload-dependent inefficiencies, resulting in limited performance improvements. To overcome these limitations, we present SLIMSTART, a profile-guided optimization tool designed to identify and mitigate inefficient library usage patterns in serverless applications. By leveraging statistical sampling and call-path profiling, SLIMSTART collects runtime library usage data, generates detailed optimization reports, and applies automated code transformations to reduce cold-start overhead. Furthermore, SLIMSTART integrates seamlessly into CI/CD pipelines, enabling adaptive monitoring and continuous optimizations tailored to evolving workloads. Through extensive evaluation across three benchmark suites and four real-world serverless applications, SLIMSTART achieves up to a 2.30X speedup in initialization latency, a 2.26X improvement in end-to-end latency, and a 1.51X reduction in memory usage, demonstrating its effectiveness in addressing cold-start inefficiencies and optimizing resource utilization.