VirtuWander: Enhancing Multi-modal Interaction for Virtual Tour Guidance through Large Language Models (2401.11923v2)

Published 22 Jan 2024 in cs.HC

Abstract: Tour guidance in virtual museums encourages multi-modal interactions to boost user experiences, concerning engagement, immersion, and spatial awareness. Nevertheless, achieving the goal is challenging due to the complexity of comprehending diverse user needs and accommodating personalized user preferences. Informed by a formative study that characterizes guidance-seeking contexts, we establish a multi-modal interaction design framework for virtual tour guidance. We then design VirtuWander, a two-stage innovative system using domain-oriented LLMs to transform user inquiries into diverse guidance-seeking contexts and facilitate multi-modal interactions. The feasibility and versatility of VirtuWander are demonstrated with virtual guiding examples that encompass various touring scenarios and cater to personalized preferences. We further evaluate VirtuWander through a user study within an immersive simulated museum. The results suggest that our system enhances engaging virtual tour experiences through personalized communication and knowledgeable assistance, indicating its potential for expanding into real-world scenarios.


Summary

  • The paper introduces a multi-modal framework where domain-specific LLMs transform user inquiries into context-aware guidance for virtual museum tours.
  • The paper employs a two-stage methodology that classifies user contexts and generates task-specific, interactive feedback across diverse modalities.
  • The paper demonstrates VirtuWander's feasibility through thematic tour, single-artwork, and personalized tour examples, and evaluates it in a user study within a simulated virtual museum.

The paper "VirtuWander: Enhancing Multi-modal Interaction for Virtual Tour Guidance through LLMs" introduces VirtuWander, an LLM empowered system designed to enhance multi-modal interactions for virtual tour guidance in virtual museums. The authors address the limitations of current virtual museum tour guidance systems, which often lack the flexibility to accommodate diverse user needs and personalized preferences. The core contribution lies in a multi-modal interaction design framework that leverages domain-oriented LLMs to transform user inquiries into context-aware guidance, facilitating a more engaging and personalized virtual museum experience.

The authors begin by highlighting the increasing interest in virtual museums, driven by advancements in AR (Augmented Reality) and VR (Virtual Reality) technologies. They note that while virtual museums offer advantages over physical museums, such as increased accessibility and flexibility, designing effective tour guidance remains challenging. Existing approaches often rely on predefined routes and pre-written commentary, resulting in constrained interactions and limited personalization.

To address these limitations, the authors conduct a formative study to characterize guidance-seeking contexts in virtual museums. The study involves interviewing users about their guidance needs across various touring scenarios. The contexts are categorized by when users require guidance, what implicit environmental information is necessary, and what specific guidance they seek.

Based on the formative study, the authors establish a comprehensive framework comprising seven primary multi-modal guidance modalities that users expect LLMs to facilitate: avatars, voice assistance, text windows, minimaps, signposts, highlights, and virtual screens. This framework informs the design of VirtuWander, a two-stage system that employs a pack-of-bots strategy, in which each LLM-based chatbot is equipped with domain-specific knowledge.
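
To make the pack-of-bots strategy concrete, the sketch below shows one way such an arrangement could be coded: each chatbot is an ordinary LLM call primed with a domain-specific system prompt. All names here (`DomainBot`, `call_llm`, the prompt texts) are hypothetical illustrations, not the paper's actual implementation.

```python
# A minimal sketch of a pack-of-bots arrangement, assuming a generic
# chat-completion API. DomainBot, call_llm, and the prompt texts are
# hypothetical, not the paper's implementation.
from dataclasses import dataclass


def call_llm(system_prompt: str, user_message: str) -> str:
    """Placeholder for a chat-completion call to an LLM provider."""
    raise NotImplementedError("wire up an LLM provider here")


@dataclass
class DomainBot:
    """One member of the pack: an LLM chatbot primed with domain knowledge."""
    name: str
    system_prompt: str  # domain-specific knowledge injected at prompt time

    def ask(self, user_message: str) -> str:
        return call_llm(self.system_prompt, user_message)


# One bot per guidance domain, each framed by its own system prompt.
pack = {
    "navigation": DomainBot(
        name="navigation",
        system_prompt="You guide visitors through a virtual museum. "
                      "Reply with a target location and a short spoken cue.",
    ),
    "artwork": DomainBot(
        name="artwork",
        system_prompt="You are an art historian. Explain the artwork the "
                      "visitor is viewing, using the exhibit metadata provided.",
    ),
}
```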

The first stage performs context identification: user inquiries are classified and relevant information is extracted. The second stage handles feedback generation: task-specific LLM responses are produced based on the identified context. This two-stage approach enables VirtuWander to provide personalized, context-aware guidance through various multi-modal feedback mechanisms.
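
As a rough illustration of how the two stages could be chained, the sketch below reuses the hypothetical `call_llm` and `pack` from the previous sketch; the context labels and the modality routing table are assumptions for illustration, not the paper's actual categories.

```python
# A minimal sketch of the two-stage strategy, reusing the hypothetical
# call_llm and pack from the previous sketch. Context labels and the
# modality routing table are illustrative assumptions only.

CLASSIFIER_PROMPT = (
    "Classify the visitor's inquiry into exactly one of: navigation, "
    "artwork. Reply with the label only."
)

# Hypothetical mapping from a guidance context to the feedback
# modalities the front end should render for it.
CONTEXT_TO_MODALITIES = {
    "navigation": ["minimap", "signpost", "voice"],
    "artwork": ["highlight", "virtual_screen", "voice"],
}


def guide(user_inquiry: str) -> dict:
    # Stage 1: context identification -- classify the inquiry so the
    # matching domain bot and feedback channels can be selected.
    context = call_llm(CLASSIFIER_PROMPT, user_inquiry).strip().lower()

    # Stage 2: feedback generation -- the task-specific bot produces the
    # response, which is then routed to the chosen output modalities.
    bot = pack.get(context, pack["artwork"])  # fall back to a default bot
    response = bot.ask(user_inquiry)
    return {
        "context": context,
        "response": response,
        "modalities": CONTEXT_TO_MODALITIES.get(context, ["voice"]),
    }
```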

The authors demonstrate the feasibility and versatility of VirtuWander through three virtual guiding examples: a thematic tour exploration, a single artwork exploration, and a personal tour customization. These examples showcase the system's ability to encompass diverse tour contexts and address personalized user requirements.

A user study is conducted in a simulated virtual museum to evaluate the effectiveness of VirtuWander. The results suggest that the system enhances engaging virtual tour experiences through personalized communication and knowledgeable assistance. Participants interacted with different modalities, including voice, avatar, text window, highlight, and virtual screen, with the virtual screen being particularly well received.

The authors also discuss design implications for future tour guidance systems, emphasizing the need for enriched input modalities, a combination of active and passive feedback, support for natural and directive communication styles, customized information granularity, and ensured information accuracy. They acknowledge the challenges of extending VirtuWander to real-world scenarios, including data collection, feedback presentation, privacy concerns, and adaptability.

In summary, the paper makes the following key contributions:

  • A design framework for LLM-empowered multi-modal feedback to enhance various tour contexts with interactive guidance, derived from a formative study.
  • VirtuWander, a voice-controlled prototype demonstrating five interaction designs within a simulated virtual museum, incorporating a two-stage strategy for bridging user input and multi-modal feedback.
  • An evaluation of LLM-enhanced multi-modal interactions for guided tour experiences through showcases and a user study, highlighting capabilities, potential, and limitations.