Evaluating Quality of Chatbots and Intelligent Conversational Agents

Published 15 Apr 2017 in cs.CY and cs.SE | (1704.04579v1)

Abstract: Chatbots are one class of intelligent, conversational software agents activated by natural language input (which can be in the form of text, voice, or both). They provide conversational output in response, and if commanded, can sometimes also execute tasks. Although chatbot technologies have existed since the 1960s and have influenced user interface development in games since the early 1980s, chatbots are now easier to train and implement. This is due to plentiful open source code, widely available development platforms, and implementation options via Software as a Service (SaaS). In addition to enhancing customer experiences and supporting learning, chatbots can also be used to engineer social harm - that is, to spread rumors and misinformation, or attack people for posting their thoughts and opinions online. This paper presents a literature review of quality issues and attributes as they relate to the contemporary issue of chatbot development and implementation. Finally, quality assessment approaches are reviewed, and a quality assessment method based on these attributes and the Analytic Hierarchy Process (AHP) is proposed and examined.

Summary

The paper presents a structured method using AHP to evaluate chatbot quality by linking performance with usability criteria.
It examines the evolution from early systems like ELIZA to modern AI-driven agents integrated with social media and VR.
The study defines robust quality attributes and offers practical insights for optimizing user interactions and ethical deployment.

Evaluating Quality of Chatbots and Intelligent Conversational Agents

The paper, by Nicole Radziwill and Morgan Benton, examines chatbot quality attributes and their significance for the growing field of intelligent conversational agents. It situates chatbots within a longer history of technological change, noting the transition from early systems such as ELIZA to more sophisticated implementations that draw on machine learning and social media integration. Central to the discussion are the varied applications of chatbots, which range from beneficial uses in customer service to malicious activities intended for social manipulation.

Historical and Technological Context

The paper traces the progression from early systems designed to emulate natural language conversation to contemporary agents driven by artificial intelligence. With the development of AIML (Artificial Intelligence Markup Language) and improved speech recognition, chatbots have moved from traditional text-based environments into more complex roles, including Virtual Reality contexts. The paper attributes their wider deployment to plentiful open source code, widely available development platforms, and Software as a Service (SaaS) offerings, which together have made chatbots easier to train and implement.

Quality Assessment and Attributes

The authors devote significant attention to defining the quality attributes that determine how well chatbots and conversational agents work. They link these attributes to the three usability dimensions of ISO 9241: effectiveness, efficiency, and satisfaction. Under this framing, performance attributes such as robustness and task execution sit alongside linguistic functionality and user interaction as the basis for evaluating a chatbot.

Methodological Approach

The quality attributes are identified through a literature review that draws on interdisciplinary sources spanning nearly three decades, with explicit selection and exclusion criteria to keep the material relevant. The result is a prioritized list of quality attributes that supports systematic evaluation of chatbot implementations across different stages.

Assessment Techniques: Analytic Hierarchy Process (AHP)

One of the paper's critical contributions is the proposal of a structured quality assessment method using the Analytic Hierarchy Process (AHP). AHP offers a means to navigate complex decision-making scenarios by considering both qualitative and quantitative factors. This approach allows for a hierarchical organization of quality attributes and facilitates pairwise comparisons among them to assess comparative priorities and performance. By demonstrating how to apply AHP effectively, the paper provides a practical framework for quality assurance across diverse chatbot implementations.
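The pairwise-comparison step described above can be sketched in a few lines of Python. This is only an illustration of the general AHP calculation, not the authors' own procedure: the function name `ahp_priorities`, the choice of the three ISO 9241 dimensions as criteria, and every judgment value in `JUDGMENTS` are assumptions made for the example. The sketch estimates priority weights as the principal eigenvector of the comparison matrix (via power iteration) and reports Saaty's consistency ratio, where values below about 0.1 are conventionally treated as acceptable.

```python
from typing import List, Tuple

# Saaty's random consistency index (RI), keyed by matrix size n (1 to 10 only).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_priorities(matrix: List[List[float]],
                   iterations: int = 100) -> Tuple[List[float], float]:
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix.

    matrix[i][j] states how strongly criterion i is preferred over
    criterion j, so matrix[j][i] is expected to be 1 / matrix[i][j].
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    lambda_max = float(n)
    for _ in range(iterations):
        # Power iteration: multiply by the matrix, then renormalize to sum 1.
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        # Because the weights sum to 1, sum(A w) estimates the top eigenvalue.
        lambda_max = sum(product)
        weights = [p / lambda_max for p in product]
    # Consistency index and ratio; clamp tiny negative rounding errors to 0.
    ci = max(0.0, (lambda_max - n) / (n - 1)) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return weights, cr


# Hypothetical criteria and judgments, chosen only for illustration.
CRITERIA = ["effectiveness", "efficiency", "satisfaction"]
JUDGMENTS = [
    [1.0, 3.0, 5.0],        # effectiveness vs. each criterion
    [1 / 3, 1.0, 2.0],      # efficiency vs. each criterion
    [1 / 5, 1 / 2, 1.0],    # satisfaction vs. each criterion
]

weights, cr = ahp_priorities(JUDGMENTS)
for name, weight in zip(CRITERIA, weights):
    print(f"{name}: {weight:.3f}")
print(f"consistency ratio: {cr:.4f}")
```

In a full AHP study the same calculation would be repeated one level down, comparing candidate chatbot implementations pairwise under each criterion and combining those local priorities with the criterion weights.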

Implications and Future Directions

The quality assurance approaches explored in the paper, particularly AHP, have implications for both research and practice in artificial intelligence. By evaluating quality systematically, developers are better positioned to improve user interactions, tailor chatbot designs to specific applications, and mitigate potential harm from malicious implementations.

As chatbots continue to evolve, their integration with emerging technologies like VR promises further transformations in the user interface landscape. Consequently, understanding and refining quality attributes will remain pivotal to their development. Future research may extend this work by exploring dynamic adaptability in real-time applications and addressing challenges related to ethical deployment.

In conclusion, the paper clarifies the many dimensions of quality in chatbots and conversational agents. By proposing a structured analytical method based on AHP, it offers a path for future work on chatbot technology that balances efficiency with ethical deployment.
