VirtuWander: Enhancing Multi-modal Interaction for Virtual Tour Guidance through Large Language Models (2401.11923v2)

Published 22 Jan 2024 in cs.HC

Abstract: Tour guidance in virtual museums encourages multi-modal interactions to boost user experiences, concerning engagement, immersion, and spatial awareness. Nevertheless, achieving the goal is challenging due to the complexity of comprehending diverse user needs and accommodating personalized user preferences. Informed by a formative study that characterizes guidance-seeking contexts, we establish a multi-modal interaction design framework for virtual tour guidance. We then design VirtuWander, a two-stage innovative system using domain-oriented LLMs to transform user inquiries into diverse guidance-seeking contexts and facilitate multi-modal interactions. The feasibility and versatility of VirtuWander are demonstrated with virtual guiding examples that encompass various touring scenarios and cater to personalized preferences. We further evaluate VirtuWander through a user study within an immersive simulated museum. The results suggest that our system enhances engaging virtual tour experiences through personalized communication and knowledgeable assistance, indicating its potential for expanding into real-world scenarios.


Summary

  • The paper introduces a multi-modal framework where domain-specific LLMs transform user inquiries into context-aware guidance for virtual museum tours.
  • The paper employs a two-stage methodology that classifies user contexts and generates task-specific, interactive feedback across diverse modalities.
  • The paper demonstrates VirtuWander's feasibility through thematic tour, single-artwork, and personalized tour examples, and evaluates it in a user study within a simulated virtual museum.

The paper "VirtuWander: Enhancing Multi-modal Interaction for Virtual Tour Guidance through LLMs" introduces VirtuWander, an LLM empowered system designed to enhance multi-modal interactions for virtual tour guidance in virtual museums. The authors address the limitations of current virtual museum tour guidance systems, which often lack the flexibility to accommodate diverse user needs and personalized preferences. The core contribution lies in a multi-modal interaction design framework that leverages domain-oriented LLMs to transform user inquiries into context-aware guidance, facilitating a more engaging and personalized virtual museum experience.

The authors begin by highlighting the increasing interest in virtual museums, driven by advancements in AR (Augmented Reality) and VR (Virtual Reality) technologies. They note that while virtual museums offer advantages over physical museums, such as increased accessibility and flexibility, designing effective tour guidance remains challenging. Existing approaches often rely on predefined routes and pre-written commentary, resulting in constrained interactions and limited personalization.

To address these limitations, the authors conduct a formative study to characterize guidance-seeking contexts in virtual museums. The study involves interviewing users about their guidance needs across various touring scenarios. The contexts are categorized by when users require guidance, what implicit environmental information is necessary, and what specific guidance they seek.

Based on the formative study, the authors establish a comprehensive framework comprising seven primary multi-modal guidance modalities that users expect LLMs to facilitate: avatars, voice assistance, text windows, minimaps, signposts, highlights, and virtual screens. This framework informs the design of VirtuWander, a two-stage system that employs a pack-of-bots strategy, in which each LLM-based chatbot is equipped with domain-specific knowledge.
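
To make the pack-of-bots strategy concrete, the sketch below shows one way such an arrangement could be coded: each chatbot is an ordinary LLM call primed with a domain-specific system prompt. All names here (`DomainBot`, `call_llm`, the prompt texts) are hypothetical illustrations, not the paper's actual implementation.

```python
# A minimal sketch of a pack-of-bots arrangement, assuming a generic
# chat-completion API. DomainBot, call_llm, and the prompt texts are
# hypothetical, not the paper's implementation.
from dataclasses import dataclass


def call_llm(system_prompt: str, user_message: str) -> str:
    """Placeholder for a chat-completion call to an LLM provider."""
    raise NotImplementedError("wire up an LLM provider here")


@dataclass
class DomainBot:
    """One member of the pack: an LLM chatbot primed with domain knowledge."""
    name: str
    system_prompt: str  # domain-specific knowledge injected at prompt time

    def ask(self, user_message: str) -> str:
        return call_llm(self.system_prompt, user_message)


# One bot per guidance domain, each framed by its own system prompt.
pack = {
    "navigation": DomainBot(
        name="navigation",
        system_prompt="You guide visitors through a virtual museum. "
                      "Reply with a target location and a short spoken cue.",
    ),
    "artwork": DomainBot(
        name="artwork",
        system_prompt="You are an art historian. Explain the artwork the "
                      "visitor is viewing, using the exhibit metadata provided.",
    ),
}
```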

The first stage performs context identification: user inquiries are classified and relevant information is extracted. The second stage handles feedback generation: task-specific LLM responses are produced based on the identified context. This two-stage approach enables VirtuWander to provide personalized, context-aware guidance through various multi-modal feedback mechanisms.
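
As a rough illustration of how the two stages could be chained, the sketch below reuses the hypothetical `call_llm` and `pack` from the previous sketch; the context labels and the modality routing table are assumptions for illustration, not the paper's actual categories.

```python
# A minimal sketch of the two-stage strategy, reusing the hypothetical
# call_llm and pack from the previous sketch. Context labels and the
# modality routing table are illustrative assumptions only.

CLASSIFIER_PROMPT = (
    "Classify the visitor's inquiry into exactly one of: navigation, "
    "artwork. Reply with the label only."
)

# Hypothetical mapping from a guidance context to the feedback
# modalities the front end should render for it.
CONTEXT_TO_MODALITIES = {
    "navigation": ["minimap", "signpost", "voice"],
    "artwork": ["highlight", "virtual_screen", "voice"],
}


def guide(user_inquiry: str) -> dict:
    # Stage 1: context identification -- classify the inquiry so the
    # matching domain bot and feedback channels can be selected.
    context = call_llm(CLASSIFIER_PROMPT, user_inquiry).strip().lower()

    # Stage 2: feedback generation -- the task-specific bot produces the
    # response, which is then routed to the chosen output modalities.
    bot = pack.get(context, pack["artwork"])  # fall back to a default bot
    response = bot.ask(user_inquiry)
    return {
        "context": context,
        "response": response,
        "modalities": CONTEXT_TO_MODALITIES.get(context, ["voice"]),
    }
```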

The authors demonstrate the feasibility and versatility of VirtuWander through three virtual guiding examples: a thematic tour exploration, a single artwork exploration, and a personal tour customization. These examples showcase the system's ability to encompass diverse tour contexts and address personalized user requirements.

A user study is conducted in a simulated virtual museum to evaluate the effectiveness of VirtuWander. The results suggest that the system enhances engaging virtual tour experiences through personalized communication and knowledgeable assistance. Participants interacted with different modalities, including voice, avatar, text window, highlight, and virtual screen, with the virtual screen being particularly well received.

The authors also discuss design implications for future tour guidance systems, emphasizing the need for enriched input modalities, a combination of active and passive feedback, support for natural and directive communication styles, customized information granularity, and ensured information accuracy. They acknowledge the challenges of extending VirtuWander to real-world scenarios, including data collection, feedback presentation, privacy concerns, and adaptability.

In summary, the paper makes the following key contributions:

  • A design framework for LLM-empowered multi-modal feedback to enhance various tour contexts with interactive guidance, derived from a formative study.
  • VirtuWander, a voice-controlled prototype demonstrating five interaction designs within a simulated virtual museum, incorporating a two-stage strategy for bridging user input and multi-modal feedback.
  • An evaluation of LLM-enhanced multi-modal interactions for guided tour experiences through showcases and a user study, highlighting capabilities, potential, and limitations.