MutaBot: A Mutation Testing Approach for Chatbots
Abstract: Mutation testing assesses the effectiveness of test suites by seeding artificial faults into programs. Although mutation testing tools exist for many platforms and languages, none is currently available for conversational chatbots, an increasingly popular way to build systems that interact with users through a natural language interface. Because developers must explicitly engineer the conversations, these systems are exposed to specific types of faults that existing mutation testing tools do not support. In this paper, we present MutaBot, a mutation testing tool for conversational chatbots. MutaBot addresses mutations at multiple levels, including conversational flows, intents, and contexts. We designed the tool so that it can potentially target multiple platforms, and implemented initial support for Google Dialogflow chatbots. We assessed the tool with three Dialogflow chatbots and test cases generated with Botium, revealing weaknesses in the test suites.
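To make the idea concrete, the following is a minimal sketch of mutation testing applied to a toy chatbot. It is not MutaBot's implementation and does not use the Dialogflow or Botium APIs: the bot model, the exact-match `reply` function, and the two mutation operators (`delete_phrase`, `swap_responses`) are all hypothetical, chosen only to illustrate intent-level faults and how a test suite kills or misses them.

```python
from copy import deepcopy

# Hypothetical toy chatbot: intent name -> training phrases and a response.
BOT = {
    "greet": {"phrases": ["hello", "hi"], "response": "Hello! How can I help?"},
    "hours": {"phrases": ["opening hours", "when are you open"],
              "response": "We open 9 to 5."},
}

def reply(bot, utterance):
    """Exact-match intent lookup (a stand-in for a real NLU engine)."""
    for intent in bot.values():
        if utterance in intent["phrases"]:
            return intent["response"]
    return "Sorry, I did not understand."

def delete_phrase(bot, intent, index):
    """Assumed mutation operator: drop one training phrase from an intent."""
    mutant = deepcopy(bot)
    del mutant[intent]["phrases"][index]
    return mutant

def swap_responses(bot, a, b):
    """Assumed mutation operator: exchange the responses of two intents."""
    mutant = deepcopy(bot)
    mutant[a]["response"], mutant[b]["response"] = (
        bot[b]["response"], bot[a]["response"])
    return mutant

def passes(bot, tests):
    """A test suite passes when every utterance yields the expected reply."""
    return all(reply(bot, utterance) == expected for utterance, expected in tests)

def mutation_score(mutants, tests):
    """Fraction of mutants killed, i.e. on which at least one test fails."""
    killed = sum(1 for mutant in mutants if not passes(mutant, tests))
    return killed / len(mutants)

# Test suite: (user utterance, expected bot response).
TESTS = [
    ("hello", "Hello! How can I help?"),
    ("opening hours", "We open 9 to 5."),
]

MUTANTS = [
    delete_phrase(BOT, "greet", 0),         # removes "hello"
    delete_phrase(BOT, "greet", 1),         # removes "hi"
    swap_responses(BOT, "greet", "hours"),  # wrong answers for both intents
    delete_phrase(BOT, "hours", 1),         # removes "when are you open"
]

SCORE = mutation_score(MUTANTS, TESTS)
print(f"mutation score: {SCORE:.2f}")
```

In this sketch, the mutants that remove "hi" and "when are you open" survive because no test exercises those phrases, which is the kind of test suite weakness a mutation score is meant to expose.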