Social Conjuring: Multi-User Runtime Collaboration with AI in Building Virtual 3D Worlds (2410.00274v2)

Published 30 Sep 2024 in cs.HC, cs.AI, cs.CL, and cs.ET

Abstract: Generative artificial intelligence has shown promise in prompting virtual worlds into existence, yet little attention has been given to understanding how this process unfolds as social interaction. We present Social Conjurer, a framework for AI-augmented dynamic 3D scene co-creation, where multiple users collaboratively build and modify virtual worlds in real-time. Through an expanded set of interactions, including social and tool-based engagements as well as spatial reasoning, our framework facilitates the creation of rich, diverse virtual environments. Findings from a preliminary user study (N=12) provide insight into the user experience of this approach, how social contexts shape the prompting of spatial environments, and perspective on social applications of prompt-based 3D co-creation. In addition to highlighting the potential of AI-supported multi-user world creation and offering new pathways for AI-augmented creative processes in VR, this article presents a set of implications for designing human-centered interfaces that incorporate AI models into 3D content generation.

Summary

The paper introduces Social Conjurer, a framework that integrates generative AI to enable dynamic, real-time 3D scene creation and multi-user collaboration.
The system uses CLIP embeddings for asset retrieval and a feedback-driven spatial reasoning module; spatial reasoning reaches 88.8% accuracy with visual feedback, compared to 71.4% for a text-only baseline.
The user study with 12 participants demonstrates enhanced interactivity and creativity, setting the stage for future AI-augmented VR collaboration.

The research paper, "Social Conjuring: Multi-User Runtime Collaboration with AI in Building Virtual 3D Worlds," presents Social Conjurer, a framework that uses generative AI to let multiple users co-create and modify virtual worlds in real time. The authors describe the system at both the architectural and functional levels, covering AI-augmented 3D scene creation, multi-user collaboration, and interaction paradigms within virtual environments. The research addresses gaps in current approaches to virtual worldbuilding, particularly the use of AI to support spontaneous creation without pre-defined limitations.

The architecture builds upon and extends the LLMR framework, incorporating novel components like environment generation, spatial reasoning, and asset retrieval. The system dynamically decides whether a prompt requires static scene generation or interactive script creation, optimizing responsiveness while enhancing application scope. A significant contribution of the work is the proficiency in spatial reasoning, achieved via a sophisticated feedback-driven module that iteratively adapts asset placement and orientation based on user-specified prompts. Utilizing CLIP embeddings allows for enhanced semantic search during asset retrieval, addressing the variability and complexity inherent in user inputs.
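The embedding-based asset retrieval described above can be illustrated with a minimal sketch. It is not the authors' implementation: the asset names, the 3-dimensional vectors, and the `retrieve_assets` helper are all hypothetical stand-ins, and a real system would obtain the vectors from a CLIP text or image encoder. The sketch only shows the general idea of ranking assets by cosine similarity to a prompt embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_assets(query_embedding, asset_embeddings, top_k=1):
    """Return the top_k (name, score) pairs, best match first."""
    scored = [
        (name, cosine_similarity(query_embedding, emb))
        for name, emb in asset_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Assumption: toy 3-dimensional vectors standing in for CLIP embeddings.
ASSET_EMBEDDINGS = {
    "oak_tree": [0.9, 0.1, 0.0],
    "wooden_chair": [0.1, 0.9, 0.1],
    "campfire": [0.2, 0.1, 0.9],
}

# Assumption: pretend this is the embedding of the prompt "a tall leafy tree".
query = [0.8, 0.2, 0.1]
best = retrieve_assets(query, ASSET_EMBEDDINGS, top_k=2)
print(best[0][0])
```

Because matching happens in embedding space rather than on keywords, a prompt that never uses the word "oak" can still retrieve the tree asset, which is the kind of variability in user input that semantic search is meant to absorb.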

The system is evaluated through a user study with 12 participants, examining both technical performance and user interaction within shared virtual spaces. The study's exploration of real-time collaboration shows how dual-device and multi-user contexts shape user experience, offering insights into effective collaboration and areas that need further refinement. On the technical side, the reported results indicate that spatial reasoning with visual feedback is more accurate than a text-only baseline: 88.8% compared to 71.4%.

The paper asserts implications for future research and practical developments in AI-augmented VR. The intrinsic unpredictability in AI interpretations, deemed "controlled randomness," emerges as a potential boon for creativity despite occasional divergences from user intent. This spontaneity drives home the importance of fostering creativity by allowing AI some latitude, albeit balanced with control mechanisms to ensure user satisfaction.

Importantly, the research situates its relevance within broader discourses on multi-user collaborative frameworks, tasking future systems with exploring issues around object ownership, system transparency, and ethical design considerations. As AI capabilities continue to evolve, systems akin to Social Conjurer will likely gravitate toward more nuanced, context-aware interactions, enhancing user empowerment while respecting a diverse array of accessibility needs and ethical frameworks.

In essence, the work illuminates not only the technical intricacies involved in runtime 3D world generation but also the nuanced human-AI interactions that enrich collaborative virtual environments. As AI becomes further entrenched in VR development, such explorations will be imperative in sculpting a future where digital co-creativity flourishes uninhibited.
