
Evaluating Search-Based Software Microbenchmark Prioritization (2211.13525v4)

Published 24 Nov 2022 in cs.SE

Abstract: Ensuring that software performance does not degrade after a code change is paramount. A solution is to regularly execute software microbenchmarks, a performance testing technique similar to (functional) unit tests, which, however, often becomes infeasible due to extensive runtimes. To address that challenge, research has investigated regression testing techniques, such as test case prioritization (TCP), which reorder the execution within a microbenchmark suite to detect larger performance changes sooner. Such techniques are either designed for unit tests and perform sub-par on microbenchmarks or require complex performance models, drastically reducing their potential application. In this paper, we empirically evaluate single- and multi-objective search-based microbenchmark prioritization techniques to understand whether they are more effective and efficient than greedy, coverage-based techniques. For this, we devise three search objectives, i.e., coverage to maximize, coverage overlap to minimize, and historical performance change detection to maximize. We find that search algorithms (SAs) are only competitive with but do not outperform the best greedy, coverage-based baselines. However, a simple greedy technique utilizing solely the performance change history (without coverage information) is equally or more effective than the best coverage-based techniques while being considerably more efficient, with a runtime overhead of less than 1%. These results show that simple, non-coverage-based techniques are a better fit for microbenchmarks than complex coverage-based techniques.

Summary

  • The paper demonstrates that search-based genetic algorithms combining coverage-based and history-based objectives perform competitively with, but do not statistically surpass, the traditional greedy Total coverage baseline.
  • The study shows that the search-based techniques incur only marginal computational overhead, while a greedy technique based solely on historical performance changes is markedly more efficient, adding less than 1% runtime overhead.
  • The paper recommends non-coverage-based, greedy historical-change prioritization as a practical, low-overhead alternative for effective microbenchmark prioritization.

Evaluating Search-Based Software Microbenchmark Prioritization

The paper investigates the effectiveness and efficiency of search-based software microbenchmark prioritization (SBSMBP) techniques, which aim to detect performance changes in software systems sooner. Through an empirical comparison of search-based and greedy techniques, the paper evaluates how different prioritization objectives perform and addresses the need for efficient methods to cope with the extensive runtimes inherent in performance testing.

Summary

The authors investigate SBSMBP as a response to the lengthy execution times of microbenchmark suites, which are essential for ensuring that software performance remains stable after code changes yet are often too costly to execute in full. The paper compares search-based approaches against traditional greedy heuristics, notably those relying on code coverage as a proxy for change-detection ability. The studied techniques optimize three objectives: code coverage (to maximize), coverage overlap (to minimize), and historical performance change size (to maximize), as sketched below.
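As a concrete illustration, the following sketch shows one plausible, position-weighted formulation of the three objectives for a candidate benchmark ordering. It is not the authors' implementation: the data structures (per-benchmark coverage sets and change-history magnitudes) and the position-weighting scheme are assumptions made purely for illustration.

```python
# Illustrative sketch only: one plausible, position-weighted formulation of the
# three prioritization objectives; not the paper's actual implementation.

def coverage_objective(ordering, coverage):
    """Coverage (maximize): reward covering previously uncovered code early."""
    n = len(ordering)
    covered, score = set(), 0.0
    for pos, bench in enumerate(ordering):
        newly_covered = coverage[bench] - covered
        score += len(newly_covered) * (n - pos) / n  # earlier positions weigh more
        covered |= coverage[bench]
    return score

def overlap_objective(ordering, coverage):
    """Coverage overlap (minimize): penalize re-covering already covered code early."""
    n = len(ordering)
    covered, penalty = set(), 0.0
    for pos, bench in enumerate(ordering):
        penalty += len(coverage[bench] & covered) * (n - pos) / n
        covered |= coverage[bench]
    return penalty

def history_objective(ordering, change_history):
    """Historical change detection (maximize): reward scheduling benchmarks whose
    past runs showed large performance changes early in the ordering."""
    n = len(ordering)
    return sum(change_history.get(b, 0.0) * (n - pos) / n
               for pos, b in enumerate(ordering))
```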

The experimental setup spans 10 open-source Java projects, encompassing 1829 distinct benchmarks across 161 software versions. The experiments show that the best-performing search-based genetic algorithm (GA) techniques are not statistically more effective than the Total greedy baseline, which orders microbenchmarks by the amount of code they cover (see the sketch below).
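For contrast with the search-based techniques, here is a minimal sketch of a greedy Total-style baseline: benchmarks are ordered by the size of their coverage sets, largest first. The data structures and names are hypothetical; the paper's tooling is not reproduced here.

```python
def total_coverage_prioritization(benchmarks, coverage):
    """Greedy 'Total'-style baseline: execute benchmarks covering the most code first."""
    return sorted(benchmarks, key=lambda b: len(coverage[b]), reverse=True)

# Hypothetical usage with per-benchmark coverage sets (e.g., covered methods):
coverage = {"benchA": {"m1", "m2", "m3"}, "benchB": {"m2"}, "benchC": {"m1", "m4"}}
order = total_coverage_prioritization(coverage.keys(), coverage)
# -> ['benchA', 'benchC', 'benchB']
```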

Key Findings

  1. Benchmark Effectiveness: The genetic algorithm combining all three objectives (C-CO-CH) is competitive with the Total coverage baseline but does not surpass it. A greedy approach relying solely on the historical performance change objective (CH) is equally or more effective in terms of median effectiveness, without requiring any coverage information (see the sketch after this list).
  2. Efficiency Considerations: The search-based techniques add only marginal computational overhead compared to the greedy baselines. Greedy techniques relying on historical performance changes are markedly more efficient, introducing less than 1% runtime overhead across the studied projects.
  3. Implications for Practice: Non-coverage-based, greedy CH techniques are recommended for practitioners because of their low overhead and ease of implementation. The paper advocates non-coverage-based objectives as effective alternatives for software microbenchmark prioritization.
  4. Change-Awareness: Introducing change-awareness does not substantially affect SBSMBP effectiveness, suggesting that simpler, non-change-aware approaches suffice, simplifying implementation without sacrificing effectiveness.
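The sketch below illustrates the kind of history-based greedy technique these findings favor: benchmarks are ranked by the magnitude of performance changes observed in previous versions, with no coverage analysis at all. The change-magnitude values and aggregation are hypothetical; the paper's exact formulation is not reproduced here.

```python
def change_history_prioritization(benchmarks, change_history):
    """Greedy history-based prioritization: benchmarks whose earlier versions
    exhibited the largest performance changes run first; no coverage needed."""
    return sorted(benchmarks, key=lambda b: change_history.get(b, 0.0), reverse=True)

# Hypothetical change magnitudes (e.g., relative slowdowns seen in past versions)
history = {"benchA": 0.02, "benchB": 0.35, "benchC": 0.10}
print(change_history_prioritization(history.keys(), history))
# -> ['benchB', 'benchC', 'benchA']
```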

Implications and Future Directions

The findings underscore how difficult it is for search-based approaches to surpass traditional greedy methods in microbenchmark prioritization. The results motivate further exploration of alternative objectives that better capture important performance changes while relying less on code coverage and its associated overhead.

Future research could explore objectives that address performance changes at different levels of granularity or that focus on real-world performance faults and developer-reported issues. Moreover, algorithmic innovations tailored specifically to SBSMBP may yield new insights for enhancing prioritization strategies.

The paper contributes to the evolving discourse on performance testing methodologies, underscoring the importance of balancing effectiveness with computational efficiency and advocating for practical, low-overhead solutions in continuous integration environments.