XR-CareerAssist: An Immersive Platform for Personalised Career Guidance Leveraging Extended Reality and Multimodal AI

Published 8 Apr 2026 in cs.CE, cs.AI, cs.CV, cs.CY, and cs.ET | (2604.06901v1)

Abstract: Conventional career guidance platforms rely on static, text-driven interfaces that struggle to engage users or deliver personalised, evidence-based insights. Although Computer-Assisted Career Guidance Systems have evolved since the 1960s, they remain limited in interactivity and pay little attention to the narrative dimensions of career development. We introduce XR-CareerAssist, a platform that unifies Extended Reality (XR) with several AI modules to deliver immersive, multilingual career guidance. The system integrates Automatic Speech Recognition for voice-driven interaction, Neural Machine Translation across English, Greek, French, and Italian, a Langchain-based conversational Training Assistant for personalised dialogue, a BLIP-based Vision-LLM for career visualisations, and AWS Polly Text-to-Speech delivered through an interactive 3D avatar. Career trajectories are rendered as dynamic Sankey diagrams derived from a repository of more than 100,000 anonymised professional profiles. The application was built in Unity for Meta Quest 3, with backend services hosted on AWS. A pilot evaluation at the University of Exeter with 23 participants returned 95.6% speech recognition accuracy, 78.3% overall user satisfaction, and 91.3% favourable ratings for system responsiveness, with feedback informing subsequent improvements to motion comfort, audio clarity, and text legibility. XR-CareerAssist demonstrates how the fusion of XR and AI can produce more engaging, accessible, and effective career development tools, with the integration of five AI modules within a single immersive environment yielding a multimodal interaction experience that distinguishes it from existing career guidance platforms.


Authors (5)

Summary

The paper introduces XR-CareerAssist, a novel immersive platform that integrates extended reality and multimodal AI for personalized career guidance.
The system employs a five-layer scalable architecture with real-time speech recognition (>95% accuracy) and rapid Sankey diagram generation (0.2 sec), ensuring dynamic, user-centered interaction.
Empirical pilot evaluation demonstrated high user satisfaction (over 78%) and robust backend performance, validating its potential for widespread institutional adoption.

Immersive, Multimodal Career Guidance: Overview of XR-CareerAssist

Introduction and Theoretical Foundations

"XR-CareerAssist: An Immersive Platform for Personalised Career Guidance Leveraging Extended Reality and Multimodal AI" (2604.06901) addresses the enduring limitations of Computer-Assisted Career Guidance Systems (CACGS) by integrating Extended Reality (XR) and advanced multimodal Artificial Intelligence. Traditional CACGS approaches rely on static user interfaces and trait-factor matching paradigms, lacking both the engagement and personalization necessary for modern, dynamic career trajectories. This work draws on Career Construction Theory (CCT), shifting toward narrative, experiential, and user-centred forms of career development, and operationalizes these principles via real-time, interactive, and data-driven immersive technologies.

Multimodal System Architecture

XR-CareerAssist features a five-layer modular architecture designed for scalability, extensibility, and efficient multimodal data processing. The architecture interlinks: an XR user interface optimized for hands-free interaction via Meta Quest 3, an application layer orchestrating logic and session management, an integration layer for abstracting backend APIs and AI model calls, a services layer hosting all AI modules, and a data layer maintaining user profiles and profile-derived analytics.

Figure 1: XR-CareerAssist architecture with clearly delineated layers facilitating modular deployment and maintenance.

The system coordinates five principal AI components:

Automatic Speech Recognition (ASR): Real-time multilingual speech input processing with >95% accuracy.
Neural Machine Translation (NMT): Dynamic language support (English, Greek, French, Italian) for cross-lingual accessibility.
Conversational Agent: A context-aware dialogue system built with Langchain, integrating profile data for personalized guidance.
Vision-Language (VL) Model: BLIP-based and domain-finetuned for interpreting and answering user queries about Sankey diagram career visualizations.
Text-to-Speech (TTS): High-fidelity synthesis routed through a 3D avatar using AWS Polly, following initial PIPER evaluation.
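As an illustration of how a context-aware agent might combine profile data with recent dialogue, the following is a minimal sketch in plain Python. It does not use Langchain's API, and the names `SessionMemory` and `build_prompt`, the prompt layout, and the turn limit are all assumptions made for this example rather than details from the paper.

```python
from collections import deque


class SessionMemory:
    """Keeps only the most recent turns of a single session (hypothetical helper)."""

    def __init__(self, max_turns: int = 5) -> None:
        # A bounded deque silently drops the oldest turn once the limit is reached.
        self.turns: deque = deque(maxlen=max_turns)

    def add(self, user_text: str, assistant_text: str) -> None:
        self.turns.append((user_text, assistant_text))


def build_prompt(profile: dict, memory: SessionMemory, question: str) -> str:
    """Assemble one text prompt from profile fields, recent turns, and the new question."""
    profile_lines = [f"- {key}: {value}" for key, value in sorted(profile.items())]
    history_lines = [f"User: {u}\nAssistant: {a}" for u, a in memory.turns]
    parts = ["User profile:", *profile_lines, "Conversation so far:",
             *history_lines, f"User: {question}", "Assistant:"]
    return "\n".join(parts)


# Toy usage with made-up profile values.
memory = SessionMemory(max_turns=2)
memory.add("What roles suit me?", "Data analyst roles match your skills.")
memory.add("And in five years?", "Many analysts move into management.")
memory.add("Which sectors?", "Finance and healthcare are common.")
prompt = build_prompt({"role": "analyst", "years_experience": 4}, memory,
                      "What should I learn next?")
```

Because the memory lives only in the object, it disappears when the session ends, which mirrors the single-session limitation noted later in this summary.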

User Experience and Multimodal Pipeline

The core user journey chains voice-driven input, multilingual translation, data retrieval, visualization, and spoken dialogue output. The system hides this technical complexity from the user, so that the underlying analytics and dialogue are reached through ordinary conversation.

Figure 2: The complete user-AI interaction loop, from multimodal input to voice output via avatar.
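The interaction loop can be sketched as a chain of pluggable functions. This is a hedged illustration only: the `Modules` container, the `handle_turn` function, the stub implementations, and the choice to pivot through English are assumptions for the example, and the Vision-Language step is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Modules:
    """Hypothetical bundle of the AI services used during one voice turn."""
    asr: Callable[[bytes], str]                # audio -> text in the user's language
    translate: Callable[[str, str, str], str]  # (text, source, target) -> text
    agent: Callable[[str, dict], str]          # (English query, profile) -> English reply
    tts: Callable[[str, str], bytes]           # (text, language) -> audio


def handle_turn(audio: bytes, lang: str, profile: dict, m: Modules):
    """Run one voice turn: ASR, translate in, agent, translate out, TTS."""
    heard = m.asr(audio)
    # Assumption: the agent works in English, so other languages are translated first.
    query_en = heard if lang == "en" else m.translate(heard, lang, "en")
    reply_en = m.agent(query_en, profile)
    reply = reply_en if lang == "en" else m.translate(reply_en, "en", lang)
    return reply, m.tts(reply, lang)


# Stub modules so the sketch runs without any real models or cloud services.
stub = Modules(
    asr=lambda audio: audio.decode("utf-8"),
    translate=lambda text, src, tgt: f"[{src}->{tgt}] {text}",
    agent=lambda query, profile: f"Reply for {profile['role']}: {query}",
    tts=lambda text, lang: text.encode("utf-8"),
)

reply, audio_out = handle_turn(b"bonjour", "fr", {"role": "analyst"}, stub)
```

Passing the modules in as plain callables keeps each stage replaceable, for example swapping one TTS engine for another without touching the rest of the loop.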

Speech recognition accuracy is maintained even under VR constraints, while rapid NMT makes the platform accessible to non-English speakers. Fine-tuned VL models extend BLIP's general-purpose vision-language capability to career data queries and Sankey diagram interpretation, so users can ask questions about the visualization they are looking at.

Figure 3: High-accuracy ASR and real-time NMT translation enable robust, language-agnostic voice interfaces within XR.

Integration of the personalized Sankey diagram generator further strengthens the narrative and exploratory aspects of the platform by grounding career path recommendations in a curated dataset of 100,000+ anonymized professional CVs.

Figure 4: VL model successfully answers complicated user queries on Sankey-based career transition patterns, confirming fine-tuned model competence.

Backend Optimizations and Performance

Backend optimizations reduce latency and support high concurrency. Sankey diagram generation time was reduced from 45 seconds to 0.2 seconds (a 99.56% reduction), meeting sub-second responsiveness criteria for interactive XR workloads. API endpoints are served with FastAPI on GPU-enabled AWS Elastic Beanstalk infrastructure and use DuckDB for fast in-memory analytics.
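The quoted reduction can be checked with a few lines of arithmetic. The variable names below are chosen for this example; the two timings are the ones reported above.

```python
# Reported Sankey generation times, in seconds.
before_s = 45.0
after_s = 0.2

# Relative reduction as a percentage of the original time.
reduction_pct = (before_s - after_s) / before_s * 100

# The same change expressed as a speed-up factor.
speedup = before_s / after_s

print(f"{reduction_pct:.2f}% reduction, about {speedup:.0f}x faster")
# prints "99.56% reduction, about 225x faster"
```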

Figure 5: Under 1-second median latency and zero failure for 10,000 concurrent users demonstrate production-grade scalability.
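To show what a median-latency and failure-count measurement involves, here is a small self-contained simulation using `asyncio`. It is a sketch under stated assumptions: `fake_endpoint` merely sleeps in place of a real backend call, the user count is far below 10,000, and none of the function names come from the paper.

```python
import asyncio
import statistics
import time


async def fake_endpoint(delay_s: float) -> None:
    """Stand-in for a backend request; it sleeps instead of doing real work."""
    await asyncio.sleep(delay_s)


async def timed_call(delay_s: float):
    """Return (elapsed seconds, success flag) for one simulated request."""
    start = time.perf_counter()
    try:
        await fake_endpoint(delay_s)
        return time.perf_counter() - start, True
    except Exception:
        return time.perf_counter() - start, False


async def load_test(n_users: int, delay_s: float = 0.01):
    """Fire n_users simulated requests concurrently and summarise the results."""
    results = await asyncio.gather(*(timed_call(delay_s) for _ in range(n_users)))
    latencies = [elapsed for elapsed, ok in results if ok]
    failures = sum(1 for _, ok in results if not ok)
    return statistics.median(latencies), failures


median_s, failures = asyncio.run(load_test(200))
print(f"median latency: {median_s:.4f}s, failures: {failures}")
```

A real load test would issue HTTP requests against the deployed endpoints from separate client machines; this simulation only illustrates how the two summary figures are computed.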

These characteristics are essential for real-world institutional deployment in high-demand academic or workforce development contexts.

Data-Driven Visualizations and Personalized Pathways

Career progression visualizations are dynamically generated from user questionnaire input and analyzed against the CVCOSMOS profile database, yielding empirically grounded career and industry transition Sankey diagrams.
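One simple way to derive Sankey links from career histories is to count consecutive role pairs and normalise the counts per source role. The sketch below assumes that approach; the toy histories, role names, and function names are hypothetical and are not taken from the CVCOSMOS data or the paper's implementation.

```python
from collections import Counter, defaultdict

# Hypothetical toy career histories, each an ordered list of roles.
careers = [
    ["Employee", "Manager", "Director"],
    ["Employee", "Manager"],
    ["Employee", "Specialist", "Manager"],
    ["Employee", "Manager", "Director"],
]


def transition_counts(paths):
    """Count how often each role is directly followed by another role."""
    counts = Counter()
    for path in paths:
        for src, dst in zip(path, path[1:]):
            counts[(src, dst)] += 1
    return counts


def transition_probabilities(counts):
    """Divide each count by the total number of transitions leaving its source role."""
    totals = defaultdict(int)
    for (src, _), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


counts = transition_counts(careers)
probs = transition_probabilities(counts)

# Sankey libraries typically accept links as source/target/value records.
links = [{"source": s, "target": t, "value": n} for (s, t), n in sorted(counts.items())]
```

In this toy data, three of the four transitions out of "Employee" lead to "Manager", so that link carries a probability of 0.75 and would be drawn three times as wide as the "Employee" to "Specialist" link.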

Figure 6: Sankey diagram for a high-experience profile, visualizing empirical transition probabilities across a 10-year horizon.

Figure 7: Role evolution map illustrates the probabilistic progression from employee to director roles within the cohort dataset.

Figure 8: Industry shift diagram quantifies inter-sector mobility trends indicated by career data analytics.

These outputs support granular, evidence-based career scenario exploration.

XR User Interface and Pilot Evaluation

Unity-based development delivers a VR-forward user interface, including responsive questionnaire flows and a high-visibility 3D avatar assistant. Iterative UI refinement addresses VR-specific usability, motion sickness, and accessibility constraints.

Figure 9: Dynamic, VR-optimized questionnaire component for collecting relevant profile data.

Figure 10: Immersive VR environment providing simultaneous voice assistant interaction and live career mapping.

A 23-participant pilot at the University of Exeter assessed system usability, responsiveness, and perceived career guidance value. Notably, the ASR module achieved a 95.6% accuracy rating, and overall user satisfaction exceeded 78%. No failures were observed during the pilot or under load testing.
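With 23 participants, each person accounts for about 4.3 percentage points, which is worth keeping in mind when reading the pilot figures. The sketch below maps each reported percentage to the nearest whole-participant count. These counts are an inference made for this example, not numbers reported by the paper, and the percentages may have been computed differently.

```python
N_PARTICIPANTS = 23

# Percentages as reported for the pilot (the labels are shortened for this example).
reported = {
    "speech recognition": 95.6,
    "overall satisfaction": 78.3,
    "responsiveness": 91.3,
}


def nearest_count(pct: float, n: int = N_PARTICIPANTS) -> int:
    """Whole number of participants closest to pct percent of n (inferred, not reported)."""
    return round(pct / 100 * n)


inferred = {label: nearest_count(pct) for label, pct in reported.items()}

for label, pct in reported.items():
    k = inferred[label]
    print(f"{label}: reported {pct}%, nearest count {k}/{N_PARTICIPANTS} "
          f"= {k / N_PARTICIPANTS * 100:.2f}%")
```

Under this reading, a change of a single participant's response would move any of these figures by more than four percentage points.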

Figure 11: Pilot hardware and environment schematic at University of Exeter.

Post-pilot UX improvements addressed comfort, clarity, safety, and text readability in direct response to participant feedback.

Implications, Limitations, and Future Work

XR-CareerAssist advances the state of the art for CACGS through:

Deep multimodal AI integration within an immersive, empirically grounded interactive environment;
Demonstrated backend scalability and real-world pilot validation;
Facilitation of narrative, longitudinal career reflection consistent with CCT.

The system remains bounded by several constraints: limited language coverage, single-session dialogue memory, a requirement for standalone VR hardware, and demographic skew in the pilot sample. Ongoing work should prioritize expanded language and model support, persistent conversation histories, mobile and AR accessibility, and diversified, longitudinal effectiveness studies.

Conclusion

XR-CareerAssist delivers a production-scale framework for immersive, personalized career guidance, leveraging fine-tuned multimodal AI and rigorous data analytics in a VR environment. Performance metrics highlight significant advances in interactivity and responsiveness, while empirical user evaluation confirms system readiness for institutional adoption. The architecture sets a precedent for future XR-AI deployments in guidance, education, and workforce analytics, with the potential to extend this multimodal approach across diverse competency development and support systems.
