- The paper reveals that production serverless workloads exhibit dramatic invocation variations across two distinct Huawei Cloud traces.
- It employs granular per-second and per-minute traces to analyze diurnal patterns, resource utilization, and cold-start latencies.
- Findings suggest opportunities for optimizing resource scheduling and improving predictive workload forecasting models.
Analysis of Long-term Trends in Serverless Workloads on Huawei Cloud
The paper "How Does It Function? Characterizing Long-term Trends in Production Serverless Workloads" offers a detailed examination of serverless workloads within Huawei Cloud's environments, drawing on data from both internal and public-facing platforms. Spanning more than seven months, the data comprises over 1.4 trillion function invocations, providing significant insight into production Function-as-a-Service (FaaS) systems.
Overview and Dataset Description
The paper introduces two distinct traces from Huawei's infrastructures:
- Huawei Private Trace: Includes per-second statistics across multiple data centers for Huawei’s internal workloads. This trace details over 200 functions over 234 days, offering granular data on invocations, execution times, and system resource usage.
- Huawei Public Trace: Reflects data from a public FaaS platform with over 5000 functions operating within a single data center. This dataset is less granular but provides per-minute invocation counts over 26 days.
This analysis involves calculating statistical features such as request arrival rates, execution delays, cold-start influences, resource utilization, and periodic trends potentially useful for optimizing resource scheduling and autoscaling strategies in serverless environments.
Key Findings
Function Invocation Behaviors
- The disparity in function invocations is dramatic, with requests varying by up to nine orders of magnitude. Notably, some functions are invoked over a billion times per day.
- The public trace resembles comparable datasets from Azure, suggesting that these trends are intrinsic across cloud platforms.
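The nine-orders-of-magnitude spread can be illustrated with a quick sketch. The distribution below is synthetic (a log-uniform draw chosen only to mimic the reported range, not data from the traces):

```python
import numpy as np

# Hypothetical per-function daily invocation totals; the paper reports
# counts ranging from a handful to over a billion invocations per day.
rng = np.random.default_rng(0)
daily_invocations = np.round(10 ** rng.uniform(0, 9, size=5000)).astype(np.int64)

# Spread, in orders of magnitude, between the busiest and quietest functions.
spread = np.log10(daily_invocations.max()) - np.log10(max(daily_invocations.min(), 1))
print(f"spread: ~{spread:.1f} orders of magnitude")
```

A spread this wide implies that per-function (rather than one-size-fits-all) scaling policies are needed: a scheduler tuned for the median function misestimates the extremes by many orders of magnitude.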
Periodicity and Ranking
- Strong daily periodicity pervades most functions, reflecting distinct diurnal patterns: invocations typically track human activity during the day or batch processing at night.
- Functions demonstrate variable invocation patterns, with the popularity rankings of functions showing relatively minor oscillations over extended periods.
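Daily periodicity of this kind is straightforward to detect with autocorrelation at the one-day lag. The series below is synthetic (the period and amplitude are assumptions for illustration, not values from the traces):

```python
import numpy as np

# Synthetic per-minute invocation series with a diurnal cycle plus noise.
minutes_per_day = 1440
t = np.arange(7 * minutes_per_day)  # one week of per-minute samples
series = 100 + 50 * np.sin(2 * np.pi * t / minutes_per_day)
series += np.random.default_rng(1).normal(0, 5, size=t.size)

# Autocorrelation at a candidate lag: a peak at the 1-day lag signals
# strong daily periodicity, the dominant pattern reported in the paper.
def autocorr(x, lag):
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

print(f"lag 1 day   : {autocorr(series, minutes_per_day):.2f}")
print(f"lag 1/2 day : {autocorr(series, minutes_per_day // 2):.2f}")
```

On a trace with a genuine diurnal cycle, the one-day lag scores high while the half-day lag is strongly negative, which is the signature an autoscaler can exploit for day-ahead pre-provisioning.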
Resource Consumption Patterns
- CPU and memory utilization figures frequently fall below the user-defined limits, indicating a significant opportunity for resource optimization through overcommitted resource scheduling.
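The headroom an overcommitting scheduler could reclaim follows directly from the gap between reserved and measured resources. The numbers below are illustrative assumptions, not figures from the traces:

```python
import numpy as np

# Hypothetical per-function memory limits and measured usage; the paper
# reports utilization frequently falling well below user-defined limits.
rng = np.random.default_rng(2)
limits_mb = rng.choice([128, 256, 512, 1024], size=1000)
used_mb = limits_mb * rng.uniform(0.05, 0.6, size=1000)

# Aggregate fraction of reserved memory an overcommitting scheduler
# could reclaim if placements were based on measured usage.
headroom = 1 - used_mb.sum() / limits_mb.sum()
print(f"unused fraction of reserved memory: {headroom:.0%}")
```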
Cold-start Latencies
- Cold-start latencies, heavily influenced by package sizes and runtime environments, showcase long tails in their distribution. This highlights the necessity to reduce or mask cold-start durations in highly dynamic and demand-variable environments.
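A long-tailed latency distribution is best summarized by its tail percentiles rather than its mean. The sketch below uses a lognormal draw purely as a stand-in for a heavy-tailed distribution; the parameters are assumptions, not measurements from the traces:

```python
import numpy as np

# Illustrative cold-start latencies drawn from a heavy-tailed distribution.
rng = np.random.default_rng(3)
cold_start_ms = rng.lognormal(mean=5.5, sigma=1.0, size=10_000)

# Tail percentiles: a large p99/p50 ratio is what "long tail" means here,
# and is why worst-case cold starts dominate user-visible latency.
p50, p99 = np.percentile(cold_start_ms, [50, 99])
print(f"p50 ~ {p50:.0f} ms, p99 ~ {p99:.0f} ms, tail ratio ~ {p99 / p50:.1f}x")
```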
Implications and Future Prospects
The paper not only underlines the heterogeneity and dynamism of serverless workloads but also accentuates the need for advanced forecasting models capable of fine-grained, long-term workload prediction. The forecasting methods explored, including time-series models such as TimesNet and N-HiTS, show limitations in capturing precise short-term fluctuations while still accommodating long-range trends.
Areas of Research and Development
- Resource Scheduling: The identification of correlated function bursts and periodic trends suggests potential for more efficient resource utilization through predictive scheduling.
- Cold-start Optimization: Addressing cold-start latencies remains paramount, particularly through innovations in pre-warming strategies based on predictive demand analytics.
- Enhanced Predictive Modeling: The shortcomings of current forecasting models indicate fertile ground for tailored prediction algorithms that can handle both fine temporal granularity and long forecast horizons.
- Global Univariate Time Series Models: Given the findings, these models hold promise for improving forecasting efficacy while reducing the computational overhead typical of multivariate approaches.
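Any learned forecaster should at minimum beat a seasonal-naive baseline, which simply repeats the value observed one period earlier. The sketch below applies that baseline to a synthetic diurnal series (the series and its parameters are assumptions; TimesNet and N-HiTS themselves are not reimplemented here):

```python
import numpy as np

# Synthetic per-minute invocation series with a daily cycle plus noise.
minutes_per_day = 1440
t = np.arange(8 * minutes_per_day)  # eight days of per-minute samples
series = 100 + 50 * np.sin(2 * np.pi * t / minutes_per_day)
series += np.random.default_rng(4).normal(0, 5, size=t.size)

# Seasonal-naive forecast: predict each minute of the held-out last day
# as the value at the same minute one day earlier.
history, actual = series[:-minutes_per_day], series[-minutes_per_day:]
forecast = history[-minutes_per_day:]  # repeat the last full observed day
mae = np.abs(forecast - actual).mean()
print(f"seasonal-naive MAE: {mae:.1f} invocations/min")
```

On strongly periodic traces this baseline is hard to beat, which is precisely why the paper's finding of residual short-term fluctuations points to room for more sophisticated models.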
Conclusion
The paper successfully highlights the considerable variability in serverless function behavior across Huawei Cloud's infrastructures and provides meaningful guidance on managing serverless workloads more effectively. The findings advocate for the evolution of cloud resource management practices, aligning them closely with the observed operational nuances of serverless environments. While incremental advancements in machine learning techniques for forecasting fine-grained data are warranted, broader reconsideration of serverless architecture and cloud scheduling paradigms could expedite improvements in resource efficiency and application performance.