SBFT Tool Competition 2024 -- Python Test Case Generation Track (2401.15189v1)
Abstract: Test case generation (TCG) for Python poses distinctive challenges due to the language's dynamic nature and the absence of strict type information. Previous research has successfully explored automated unit TCG for Python, with solutions outperforming random test generation methods. Nevertheless, fundamental issues persist, hindering the practical adoption of existing test case generators. To address these challenges, we report on the organization, challenges, and results of the first edition of the Python Testing Competition. Four tools, namely UTBotPython, Klara, Hypothesis Ghostwriter, and Pynguin, were executed on a benchmark set consisting of 35 Python source files sampled from 7 open-source Python projects, with a time budget of 400 seconds. We considered one configuration of each tool for each test subject and evaluated the tools' effectiveness in terms of code and mutation coverage. This paper describes our methodology, the competing tools, the analysis of the results, and the challenges faced while running the competition experiments.