Practicing a Second Language Without Fear: Mixed Reality Agents for Interactive Group Conversation

Published 9 Oct 2025 in cs.HC | (2510.08227v1)

Abstract: Developing speaking proficiency in a second language can be cognitively demanding and emotionally taxing, often triggering fear of making mistakes or being excluded from larger groups. While current learning tools show promise for speaking practice, most focus on dyadic, scripted scenarios, limiting opportunities for dynamic group interactions. To address this gap, we present ConversAR, a Mixed Reality system that leverages Generative AI and XR to support situated and personalized group conversations. It integrates embodied AI agents, scene recognition, and generative 3D props anchored to real-world surroundings. Based on a formative study with experts in language acquisition, we developed and tested this system with a user study with 21 second-language learners. Results indicate that the system enhanced learner engagement, increased willingness to communicate, and offered a safe space for speaking. We discuss the implications for integrating Generative AI and XR into the design of future language learning applications.

Summary

  • The paper introduces ConversAR, a system that leverages AI-driven embodied agents and mixed reality to facilitate personalized group language practice.
  • It employs scene recognition and dynamic 3D prop generation to ground conversations in learners' real environments, enhancing engagement and contextual understanding.
  • Evaluation results indicate high usability and communicative effectiveness, while also revealing challenges in fine-tuning corrective feedback for diverse proficiency levels.

ConversAR: Mixed Reality Agents for Interactive Language Learning

The paper "Practicing a Second Language Without Fear: Mixed Reality Agents for Interactive Group Conversation" (2510.08227) introduces ConversAR, a system that uses Mixed Reality (MR) and Generative AI to support personalized group conversations for language learning. Most existing speaking-practice tools focus on dyadic, scripted scenarios; ConversAR instead uses AI-driven embodied agents to offer dynamic group interactions contextualized in the learner's physical environment.

System Architecture

ConversAR combines three components:

  • Embodied AI Agents: These agents participate in conversations tailored to the learner's proficiency and interests, providing real-time corrective feedback. The agents mimic realistic interpersonal interactions, crucial for building confidence and fluency in learners.
  • Scene Recognition: Leveraging MR, the system identifies real-world objects within the learner's environment, grounding conversations in tangible contexts.
  • Generative AI for 3D Props: The system dynamically generates 3D digital props that align with realia-based pedagogical theory, serving as conversational anchors to deepen engagement and language use.

    Figure 1: ConversAR enables second language learners to engage in group conversations with embodied AI agents tailored to their proficiency level and personal interests.
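One way to picture how the three components could feed a single agent is sketched below. This is a minimal illustration, not the authors' implementation: the class names, field names, proficiency labels, and prompt wording are all assumptions made for the example, and the summary does not say how ConversAR actually passes context to its agents.

```python
from dataclasses import dataclass, field


@dataclass
class LearnerProfile:
    """Hypothetical learner profile; field names are illustrative only."""
    proficiency: str        # assumed label, e.g. "beginner" or "intermediate"
    interests: list[str]


@dataclass
class ConversationContext:
    """Hypothetical bundle of everything an agent is grounded in."""
    profile: LearnerProfile
    scene_objects: list[str]                        # labels from scene recognition
    props: list[str] = field(default_factory=list)  # names of generated 3D props


def build_agent_prompt(agent_name: str, ctx: ConversationContext) -> str:
    """Assemble a text prompt that grounds one agent in the learner's context."""
    lines = [
        f"You are {agent_name}, a partner in a group language practice session.",
        f"Learner proficiency: {ctx.profile.proficiency}.",
        f"Learner interests: {', '.join(ctx.profile.interests)}.",
        f"Objects visible in the room: {', '.join(ctx.scene_objects)}.",
    ]
    # Props are optional: mention them only when some have been generated.
    if ctx.props:
        lines.append(f"Virtual props available: {', '.join(ctx.props)}.")
    return "\n".join(lines)


# Illustrative values, not data from the study.
ctx = ConversationContext(
    profile=LearnerProfile("intermediate", ["cooking", "travel"]),
    scene_objects=["mug", "laptop"],
    props=["frying pan"],
)
prompt = build_agent_prompt("Agent A", ctx)
print(prompt)
```

Keeping the profile, scene labels, and props in one context object means each agent in the group can be built from the same grounding while differing only in name or persona.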

Interaction Flow

ConversAR orchestrates the interaction through several sequential phases:

  1. Language Proficiency Assessment: Initial one-on-one dialogues assess the learner’s proficiency and interests, forming a foundational profile to tailor subsequent interactions.

    Figure 2: ConversAR interaction flow, illustrating the system's assessment of language skills and dynamic conversation initiation.

  2. Group Conversation with AI Agents: Following assessment, learners engage in group discussions with two AI agents. These conversations leverage real-world objects and personalized context to encourage participation.
  3. Real-time Feedback and Contextualization: The agents adapt their corrective strategies, including recasts and clarification requests, based on the user’s responses, ensuring instructional efficacy while maintaining conversational flow.

    Figure 3: System Overview of ConversAR, showcasing adaptive group conversation referencing physical objects and personalized digital props.
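The choice between the two corrective strategies named above can be sketched as a small decision function. The selection rule here is an assumption made for illustration (the summary does not describe how ConversAR decides), and the names `Feedback`, `PHASES`, and `choose_feedback` are hypothetical.

```python
from enum import Enum


class Feedback(Enum):
    NONE = "none"
    # Recast: the agent restates the learner's utterance in corrected form
    # without explicitly flagging the error.
    RECAST = "recast"
    # Clarification request: the agent signals it did not understand and
    # asks the learner to repeat or rephrase.
    CLARIFICATION_REQUEST = "clarification_request"


# Phases in the order the summary lists them (labels are illustrative).
PHASES = ("assessment", "group_conversation")


def choose_feedback(has_error: bool, intelligible: bool) -> Feedback:
    """Pick a corrective strategy for one learner turn (assumed rule)."""
    if not intelligible:
        # The meaning was lost, so ask the learner to try again.
        return Feedback.CLARIFICATION_REQUEST
    if has_error:
        # The meaning is clear, so correct implicitly and keep the talk flowing.
        return Feedback.RECAST
    return Feedback.NONE


print(choose_feedback(has_error=True, intelligible=True).value)
print(choose_feedback(has_error=True, intelligible=False).value)
```

A rule like this favors implicit correction whenever the learner was understood, which is one plausible way to preserve conversational flow; a real system would likely also weigh proficiency and how often it has already corrected the learner.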

Evaluation Metrics

The user study involved 21 second-language learners and collected standardized survey measures (e.g., CETI, SUS, NASA-TLX):

  • Communicative Effectiveness: Participants reported high communicative effectiveness in structured interactions, although challenges remained in expressing emotion.

    Figure 4: Communicative Effectiveness Index (CETI) Survey Results illustrating user confidence in various communication tasks.

  • Usability and Engagement: The System Usability Scale (SUS) indicated high ease-of-use scores, suggesting that ConversAR's design is intuitive enough to support language practice.

    Figure 5: System Usability Scale (SUS) results, indicating overall positive user feedback on ease of use and engagement.
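For readers unfamiliar with how a SUS score is derived, the standard scoring rule can be written out as follows. The function name `sus_score` is ours, and the example responses are made up for illustration; they are not data from the study, whose per-participant scores are not given in this summary.

```python
def sus_score(responses: list[int]) -> float:
    """Standard System Usability Scale score (0 to 100) for one participant.

    `responses` holds the ten item ratings in questionnaire order,
    each on a 1 (strongly disagree) to 5 (strongly agree) scale.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for item, rating in enumerate(responses, start=1):
        if item % 2 == 1:
            # Odd-numbered items are positively worded: contribute rating - 1.
            total += rating - 1
        else:
            # Even-numbered items are negatively worded: contribute 5 - rating.
            total += 5 - rating
    # The 0 to 40 sum is rescaled to a 0 to 100 range.
    return total * 2.5


# Illustrative responses only.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

Because odd and even items are scored in opposite directions, a participant who answers every item with the neutral rating of 3 lands exactly at 50.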

Discussion and Implications

ConversAR's design and study results point to several implications for future language learning tools:

  • Personalization and Engagement: The use of AI-driven personalization to match learner preferences is crucial for maintaining engagement and promoting willingness to communicate.
  • Contextual Learning: Grounding conversations in a learner’s environment provides authentic learning experiences, enhancing retention and applicability of language skills.
  • Challenges in Feedback Implementation: While corrective feedback is a valuable component, tuning the feedback to avoid overwhelming beginners and to match the learner’s level remains challenging.

Future directions for ConversAR development might explore increased customization of agent personalities to enhance learner-agent affinity and improve engagement. Additionally, refining 3D object generation to better match the conversational context could enhance interactivity and conversational depth.

Conclusion

The ConversAR system represents a significant advancement in MR-based language learning applications, providing learners with realistic and context-rich environments for language practice. By addressing traditional limitations in language education technology, ConversAR sets a foundation for future research and development in adaptive AI-driven educational tools. Through thoughtful integration of context, personalization, and immersive interaction, ConversAR has the potential to reshape how learners engage with language acquisition in dynamic and meaningful ways.
