Prosody for Intuitive Robotic Interface Design: It's Not What You Said, It's How You Said It (2403.08144v1)

Published 13 Mar 2024 in cs.RO and cs.HC

Abstract: In this paper, we investigate the use of 'prosody' (the musical elements of speech) as a communicative signal for intuitive human-robot interaction interfaces. Our approach, rooted in Research through Design (RtD), examines the application of prosody in directing the navigation of a quadruped robot. We involved ten team members in an experiment to command a robot through an obstacle course using natural interaction. A human operator, serving as the robot's sensory and processing proxy, translated human communication into a basic set of navigation commands, effectively simulating an intuitive interface. During our analysis of the interaction videos, we turned to non-verbal auditory cues whenever lexical and visual cues proved insufficient for accurate command interpretation. Qualitative evidence suggests that participants intuitively relied on prosody to control robot navigation. We highlight distinct prosodic constructs that emerged from this preliminary exploration and discuss their pragmatic functions. This work contributes a discussion of the broader potential of prosody as a multifunctional communicative signal for designing future intuitive robotic interfaces, enabling lifelong learning and personalization in human-robot interaction.
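The study itself kept a human in the loop (the operator translated participants' speech into navigation commands by ear), but the pipeline the abstract gestures toward can be sketched in code. Below is a minimal, illustrative Python sketch, assuming librosa for pitch tracking: it extracts a fundamental-frequency (F0) contour from a recorded utterance and maps the contour's overall slope to one of a few hypothetical navigation commands. The feature choice (a linear fit to the pitch track), the thresholds, and the command names are assumptions for illustration, not the paper's method.

```python
# Illustrative sketch, not the paper's implementation: map the pitch
# contour of a spoken utterance to a coarse robot navigation command.
import numpy as np
import librosa


def prosody_to_command(wav_path: str) -> str:
    """Classify an utterance as GO / STOP / CONTINUE from its pitch slope.

    The command set and thresholds are hypothetical placeholders; a real
    system would calibrate them per speaker.
    """
    y, sr = librosa.load(wav_path, sr=None)  # keep the native sample rate

    # Track the fundamental frequency (F0) with the pYIN algorithm.
    f0, voiced_flag, _ = librosa.pyin(
        y,
        fmin=librosa.note_to_hz("C2"),  # ~65 Hz, generous lower bound
        fmax=librosa.note_to_hz("C7"),  # ~2093 Hz, generous upper bound
        sr=sr,
    )

    # Keep only voiced frames with a valid F0 estimate.
    mask = voiced_flag & np.isfinite(f0)
    voiced_f0 = f0[mask]
    if voiced_f0.size < 2:
        return "STOP"  # no usable pitch information in the utterance

    # Fit a line to the contour; the slope (Hz per voiced frame) crudely
    # separates rising from falling intonation.
    slope = np.polyfit(np.arange(voiced_f0.size), voiced_f0, deg=1)[0]

    if slope > 0.5:        # rising contour, e.g. an encouraging "go, go?"
        return "GO"
    elif slope < -0.5:     # falling contour, e.g. a firm "whoa."
        return "STOP"
    return "CONTINUE"      # flat contour: keep current behavior
```

A slope threshold of 0.5 Hz per frame is arbitrary here; the paper's point is precisely that such prosody-to-intent mappings emerged per speaker and per situation, which is why personalization and lifelong learning are flagged as design directions.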

