The State of Speech in HCI: Trends, Themes and Challenges (1810.06828v1)

Published 16 Oct 2018 in cs.HC

Abstract: Speech interfaces are growing in popularity. Through a review of 68 research papers this work maps the trends, themes, findings and methods of empirical research on speech interfaces in HCI. We find that most studies are usability/theory-focused or explore wider system experiences, evaluating Wizard of Oz, prototypes, or developed systems by using self-report questionnaires to measure concepts like usability and user attitudes. A thematic analysis of the research found that speech HCI work focuses on nine key topics: system speech production, modality comparison, user speech production, assistive technology & accessibility, design insight, experiences with interactive voice response (IVR) systems, using speech technology for development, people's experiences with intelligent personal assistants (IPAs) and how user memory affects speech interface interaction. From these insights we identify gaps and challenges in speech research, notably the need to develop theories of speech interface interaction, grow critical mass in this domain, increase design work, and expand research from single to multiple user interaction contexts so as to reflect current use contexts. We also highlight the need to improve measure reliability, validity and consistency, in the wild deployment and reduce barriers to building fully functional speech interfaces for research.

Authors (10)
  1. Leigh Clark (16 papers)
  2. Phillip Doyle (1 paper)
  3. Diego Garaialde (6 papers)
  4. Emer Gilmartin (4 papers)
  5. Stephan Schlögl (12 papers)
  6. Jens Edlund (1 paper)
  7. Matthew Aylett (4 papers)
  8. João Cabral (2 papers)
  9. Cosmin Munteanu (4 papers)
  10. Benjamin Cowan (4 papers)
Citations (195)

Summary

Overview of "The State of Speech in HCI: Trends, Themes and Challenges"

Introduction

"The State of Speech in HCI: Trends, Themes and Challenges" reviews empirical research on speech interfaces within Human-Computer Interaction (HCI). By analyzing 68 research papers, the authors map the trends, themes, methods, and findings of the field. The growth of speech interfaces, from interactive voice response (IVR) systems to intelligent personal assistants (IPAs) such as Amazon Alexa and Apple Siri, has spurred considerable empirical research. This summary covers the paper's key findings, its methodological trends, and the challenges it identifies for future research.

Key Findings

The paper groups the empirical research into nine topics: system speech production, modality comparison, user speech production, assistive technology & accessibility, design insight, experiences with IVR systems, speech technology for development, people's experiences with IPAs, and the effect of user memory on speech interface interaction. Most studies are usability- or theory-focused, or explore wider system experiences. They typically evaluate Wizard of Oz setups, prototypes, or developed systems, using self-report questionnaires to measure concepts such as usability and user attitudes. The authors question the reliability and validity of these measures, which often lack standardization.

Methodological Approaches

The predominant methodological approach in the reviewed papers is quantitative, with a marked preference for objective measures of user interaction. Although questionnaires are used frequently, only a few studies employ validated scales such as the Subjective Assessment of Speech System Interfaces (SASSI) or the System Usability Scale (SUS). Observational data and interviews complement the quantitative measures and provide deeper insight into user behavior and experience. The limited use of qualitative and design-oriented approaches, however, suggests a gap in understanding user interaction dynamics and design implications.
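To make concrete what a validated scale involves, the following is a minimal sketch of the standard SUS scoring procedure (ten items rated 1 to 5, odd items positively worded, even items negatively worded). It is not taken from the reviewed paper; the function name `sus_score` and the input format are assumptions made for illustration.

```python
def sus_score(responses):
    """Compute a System Usability Scale score (0-100).

    `responses` is a hypothetical input format: ten integers in 1..5,
    ordered by item number (item 1 first).
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: contribution is rating - 1.
            total += rating - 1
        else:
            # Even items are negatively worded: contribution is 5 - rating.
            total += 5 - rating

    # The summed contributions (0..40) are scaled to 0..100.
    return total * 2.5


if __name__ == "__main__":
    print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

Because every study that uses the scale applies the same scoring rule, scores can be compared across systems, which is the kind of consistency the authors find missing when studies rely on ad hoc questionnaires.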

Research and Methodological Challenges

The authors outline several critical research challenges. These include the need to develop robust theories specific to speech interface interactions, achieving a critical mass in speech-related research topics, and enhancing design research to inform user-centered speech interface development. Furthermore, the investigation of multi-user interaction contexts is essential, considering the social nature of devices like IPAs.

On the methodological side, the authors call for more reliable and valid evaluation measures, in particular the development and consistent use of validated subjective scales. The paper also advocates more "in-the-wild" deployment studies, which could bridge the gap between lab-based findings and real-world use. Reducing the barriers to building functional speech interface prototypes would enable more iterative design and usability research.
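One common way to check the reliability of a multi-item questionnaire is Cronbach's alpha, an internal-consistency statistic. The paper does not prescribe this statistic; it is used here only as an illustrative assumption, and the function name `cronbach_alpha` and the data layout are hypothetical. The sketch uses only the Python standard library.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a questionnaire.

    `scores` is a list of respondents, each a list of k item ratings
    (an assumed layout for this sketch).
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("each respondent needs the same k >= 2 items")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    ratings = [
        [4, 5, 4],
        [2, 2, 3],
        [5, 4, 5],
        [3, 3, 3],
    ]
    print(round(cronbach_alpha(ratings), 3))
```

Values closer to 1 indicate that the items vary together across respondents. Alpha addresses reliability only; a scale's validity has to be established separately.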

Implications and Future Trends

The paper's findings bear directly on the future of speech interface research within HCI. As speech interfaces become more integrated into daily life, robust theories and validated evaluation measures will matter more. Practically, improving the usability and reliability of existing systems could lead to better user adoption and satisfaction. Theoretical advances could guide the development of more intuitive and adaptive interface designs. Future studies should deepen our understanding of how users interact with speech interfaces, particularly in diverse and socially rich environments, and encourage a more holistic view of speech-enabled systems in HCI.

In conclusion, "The State of Speech in HCI: Trends, Themes and Challenges" provides a pivotal foundation for understanding current research in speech interfaces. It effectively articulates the trends, methodological approaches, and thematic focuses within speech HCI, offering a roadmap for future research endeavors aimed at refining theory, design, and practical applications of speech-enabled systems.