Augmented Conversation with Embedded Speech-Driven On-the-Fly Referencing in AR (2405.18537v1)
Abstract: This paper introduces the concept of augmented conversation, which aims to support co-located in-person conversations via embedded speech-driven on-the-fly referencing in augmented reality (AR). Today, computing technologies like smartphones allow quick access to a variety of references during a conversation. However, these tools often create distractions, reducing eye contact and forcing users to focus their attention on phone screens and manually enter keywords to access relevant information. In contrast, AR-based on-the-fly referencing provides relevant visual references in real time, based on keywords extracted automatically from the spoken conversation. By embedding these visual references in AR around the conversation partner, augmented conversation reduces distraction and friction, allowing users to maintain eye contact and supporting more natural social interactions. To demonstrate this concept, we developed a HoloLens-based interface that leverages real-time speech recognition, natural language processing, and gaze-based interactions for on-the-fly embedded visual referencing. In this paper, we explore the design space of visual referencing for conversations and describe our implementation, which builds on seven design guidelines identified through a user-centered design process. An initial user study confirms that our system decreases distraction and friction in conversations compared to smartphone searches, while providing highly useful and relevant information.
Authors: Shivesh Jadon, Mehrad Faridan, Edward Mah, Rajan Vaish, Wesley Willett, Ryo Suzuki