ChatGPT and Human Synergy in Black-Box Testing: A Comparative Analysis (2401.13924v1)
Abstract: In recent years, large language models (LLMs) such as ChatGPT have been pivotal in advancing artificial intelligence applications, including natural language processing and software engineering. A promising yet underexplored area is the use of LLMs in software testing, particularly black-box testing. This paper compares the test cases devised by ChatGPT with those created by human participants. In this study, ChatGPT (GPT-4) and four human participants each created black-box test cases for three applications based on specifications written by the authors. The goal was to evaluate the real-world applicability of the generated test cases, identify potential shortcomings, and understand how ChatGPT could enhance human testing strategies. We found that ChatGPT can generate test cases that generally match or slightly surpass those created by human participants in terms of test viewpoint coverage. Moreover, our experiments showed that when ChatGPT cooperates with a human, the pair covers considerably more test viewpoints than either can achieve alone, suggesting that collaboration between a human and ChatGPT may be more effective than human pairs working together. Nevertheless, the test cases generated by ChatGPT have certain issues that must be addressed before they can be used in practice.
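To make the setup concrete, the sketch below shows how black-box test cases might be requested from GPT-4 given a written specification. The prompt wording, the `generate_test_cases` helper, and the example specification are illustrative assumptions rather than the authors' actual protocol; the call uses the OpenAI Python client's chat-completions API.

```python
# Minimal sketch (not the authors' protocol): asking GPT-4 for black-box
# test cases derived from a written specification.
# Assumes the `openai` package (>= 1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def generate_test_cases(spec: str) -> str:
    """Ask GPT-4 to propose black-box test cases for the given specification."""
    response = client.chat.completions.create(
        model="gpt-4",  # the study used GPT-4; the exact snapshot is an assumption
        messages=[
            {"role": "system",
             "content": "You are a software tester designing black-box test cases."},
            {"role": "user",
             "content": (
                 "Based on the following specification, list black-box test cases. "
                 "For each, give a test viewpoint, input conditions, and the expected result.\n\n"
                 + spec
             )},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Hypothetical specification excerpt, for illustration only.
    spec = "The login form accepts a user ID (8-16 alphanumeric characters) and a password."
    print(generate_test_cases(spec))
```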
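The comparison above is framed in terms of test viewpoint coverage: the share of a reference list of viewpoints that a given set of test cases touches, alone and in combination. A minimal sketch of that bookkeeping, using invented viewpoint labels rather than study data, might look like this:

```python
# Illustrative only: viewpoint labels and numbers are invented, not study data.
# Coverage is the fraction of reference viewpoints hit by a set of test cases.
reference_viewpoints = {
    "empty input", "maximum length", "invalid characters",
    "boundary values", "concurrent use", "error message wording",
}

human_viewpoints = {"empty input", "maximum length", "boundary values"}
chatgpt_viewpoints = {"empty input", "invalid characters",
                      "boundary values", "error message wording"}

def coverage(covered: set[str]) -> float:
    """Share of reference viewpoints covered by the given set."""
    return len(covered & reference_viewpoints) / len(reference_viewpoints)

print(f"human alone:     {coverage(human_viewpoints):.0%}")
print(f"ChatGPT alone:   {coverage(chatgpt_viewpoints):.0%}")
# The paper's key observation: the union covers more viewpoints than either alone.
print(f"human + ChatGPT: {coverage(human_viewpoints | chatgpt_viewpoints):.0%}")
```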
Authors: Hiroyuki Kirinuki, Haruto Tanno