Automated capture and delivery of assistive task guidance with an eyewear computer: The GlaciAR system (1701.02586v1)

Published 29 Dec 2016 in cs.HC

Abstract: In this paper we describe and evaluate a mixed reality system that aims to augment users in task guidance applications by combining automated and unsupervised information collection with minimally invasive video guides. The result is a self-contained system that we call GlaciAR (Glass-enabled Contextual Interactions for Augmented Reality), that operates by extracting contextual interactions from observing users performing actions. GlaciAR is able to i) automatically determine moments of relevance based on a head motion attention model, ii) automatically produce video guidance information, iii) trigger these video guides based on an object detection method, iv) learn without supervision from observing multiple users and v) operate fully on-board a current eyewear computer (Google Glass). We describe the components of GlaciAR together with evaluations on how users are able to use the system to achieve three tasks. We see this work as a first step toward the development of systems that aim to scale up the notoriously difficult authoring problem in guidance systems and where people's natural abilities are enhanced via minimally invasive visual guidance.

Authors (3)
  1. Teesid Leelasawassuk (3 papers)
  2. Dima Damen (83 papers)
  3. Walterio Mayol-Cuevas (27 papers)
Citations (17)

Summary

Automated Task Guidance with GlaciAR: An Overview

The paper "Automated capture and delivery of assistive task guidance with an eyewear computer: The GlaciAR system" presents a comprehensive account of the GlaciAR system, a mixed reality platform that utilizes Google Glass to provide unsupervised task guidance. This system, developed by researchers from the University of Bristol, is innovatively designed to capture and deliver task guidance through minimally invasive video snippets, leveraging head motion-driven attention detection. The paper offers a detailed exploration of the system's architecture, efficacy, and potential applications, marking a substantive contribution to augmented reality (AR) research.

System Architecture and Methodology

GlaciAR is built around three primary components: a user-attention model, video snippet recording, and object detection. Together, these allow the system to autonomously determine moments of user attention, capture relevant video snippets, and recognize objects when a novice user inspects them. Notably, GlaciAR runs entirely on-board Google Glass, its sole hardware platform, demonstrating that the system is viable without extensive computational resources or complex setups.
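
To make the division of labor concrete, here is a minimal Python sketch of the capture-and-delivery loop implied by this architecture. All names (GlaciARLoop, is_attending, and so on) are hypothetical illustrations; the paper does not publish an API, and the real system runs on-board Google Glass.

```python
# Hypothetical sketch of GlaciAR's capture/delivery loop.
# Class names and interfaces are illustrative, not from the paper.

class GlaciARLoop:
    def __init__(self, attention, detector, guides):
        self.attention = attention  # head-motion attention model (IMU-based)
        self.detector = detector    # object detector used to trigger guides
        self.guides = guides        # store of auto-captured video snippets

    def step(self, frame, imu_sample):
        # Capture side: when head motion signals attention, record a
        # snippet of the current activity for later reuse as guidance.
        if self.attention.is_attending(imu_sample):
            self.guides.record_snippet(frame)

        # Delivery side: when a known object is detected, replay the
        # snippet previously captured for that object.
        obj = self.detector.detect(frame)
        if obj is not None and self.guides.has_guide(obj):
            self.guides.play_guide(obj)
```

Because capture and delivery share the same loop, the system can keep learning from additional users without supervision, which is the property the authors highlight as addressing the authoring bottleneck.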

The attention model is particularly noteworthy: rather than relying on eye-gaze tracking, a common method in similar systems, it employs IMU-derived head motion data. This design simplifies the system's requirements and keeps it applicable to current commercial eyewear hardware. The paper reports that the attention model can predict user interaction 1.18 seconds in advance on average, enough lead time to begin capturing relevant video snippets.
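
As an illustration only, the sketch below implements one simple heuristic in this spirit: treating sustained low angular velocity (a stabilized head) as a proxy for attention. The function name, thresholds, and the heuristic itself are assumptions for exposition; the paper's actual attention model differs.

```python
import numpy as np

def attention_onsets(gyro, fs=100.0, thresh=0.2, min_dur=0.5):
    """Flag attention when gyroscope magnitude stays below `thresh`
    (rad/s) for at least `min_dur` seconds. `gyro` is an (N, 3) array
    of angular velocities sampled at `fs` Hz. Parameters are invented
    for illustration; the paper's model and values differ."""
    mag = np.linalg.norm(gyro, axis=1)  # per-sample head rotation speed
    quiet = mag < thresh                # head is relatively still
    min_samples = int(min_dur * fs)
    onsets, run = [], 0
    for i, q in enumerate(quiet):
        run = run + 1 if q else 0
        if run == min_samples:          # sustained stillness detected
            onsets.append((i - min_samples + 1) / fs)  # onset time (s)
    return onsets
```

In a GlaciAR-style pipeline, an onset flagged this way would start (or retroactively buffer) snippet recording; the roughly one-second prediction lead reported in the paper leaves room for exactly this kind of buffering.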

Evaluation and Results

The paper systematically evaluates the system across three tasks: operating an oscilloscope, a screwdriver, and a sewing machine. Performance was gauged through task success rate, completion time, and user feedback. Success rates were uniformly high, reaching 100% for simpler tasks and over 85% for more involved procedures. These results indicate that the automatically captured and delivered videos were as effective as those manually curated by domain experts, a significant finding that validates GlaciAR's automated approach.

Implications and Future Directions

The implications of this research are twofold, affecting both theoretical and practical domains within AR. Theoretically, it extends the utility of head motion as an indicator of attention, a less explored but potent alternative to gaze tracking in AR interfaces. Practically, the system illustrates the feasibility of deploying AR guidance on commercially available hardware without extensive preprocessing or bespoke equipment, lowering the barrier to entry for task guidance systems and making widespread deployment more feasible.

Looking toward future developments, the findings point to improvements in information overlay and workflow management on eyewear displays. Participants commonly cited issues with display size and the clarity of the video guidance, highlighting areas for refinement, possibly through integrated feedback mechanisms or better display hardware.

In conclusion, the GlaciAR system represents a significant step toward scalable, unsupervised AR task guidance. By automating the traditionally labor-intensive authoring process of guidance systems, it paves the way for broader adoption and refinement of AR technologies across domains, and its findings and methodologies offer useful benchmarks for future research on eyewear-based guidance.
