Gaze-based, Context-aware Robotic System for Assisted Reaching and Grasping (1809.08095v2)

Published 21 Sep 2018 in cs.RO and cs.HC

Abstract: Assistive robotic systems endeavour to support those with movement disabilities, enabling them to move again and regain functionality. The main issue with these systems is the complexity of their low-level control, and how to translate this into simpler, higher-level commands that are easy and intuitive for a human user to interact with. We have created a multi-modal system, consisting of different sensing, decision making and actuating modalities, leading to intuitive, human-in-the-loop assistive robotics. The system takes its cue from the user's gaze to decode their intentions and implement low-level motion actions that achieve high-level tasks. This results in the user simply having to look at the objects of interest for the robotic system to assist them in reaching for those objects, grasping them, and using them to interact with other objects. We present our method for 3D gaze estimation, and a grammars-based implementation of sequences of actions with the robotic system. The 3D gaze estimation is evaluated with 8 subjects, showing an overall accuracy of $4.68\pm0.14$ cm. The full system is tested with 5 subjects, showing successful implementation of $100\%$ of reach-to-gaze-point actions, and full implementation of pick-and-place tasks in $96\%$ and pick-and-pour tasks in $76\%$ of cases. Finally, we present a discussion of our results and the future work needed to improve the system.

Authors (3)
  1. Ali Shafti (17 papers)
  2. Pavel Orlov (7 papers)
  3. A. Aldo Faisal (39 papers)
Citations (50)

Summary

  • The paper demonstrates that integrating eye-tracking with a ROS-based framework achieves precise 3D gaze estimation with a mean error of 4.68±0.14 cm.
  • It outlines a multi-component system combining eye-tracking glasses, RGB-D cameras, CNN-driven object recognition, and robotic actuators for seamless interaction.
  • The evaluation reports 100% success in reach-to-gaze actions, 96% in pick-and-place tasks, and 76% in pick-and-pour tasks, highlighting robust performance alongside areas for future improvement.

Gaze-based, Context-aware Robotic System for Assisted Reaching and Grasping

The paper presents a novel assistive robotic system that leverages a gaze-based, context-aware approach to facilitate reaching and grasping tasks. This system aims to enhance the capabilities of individuals with upper limb disabilities by providing an intuitive and easy-to-use interface that relies on eye-tracking signals to infer user intentions and execute corresponding robotic actions. The authors introduce a multi-component framework involving eye-tracking glasses, a depth camera, object recognition modules, and robotic actuators to achieve these objectives.

In terms of system design, the authors employ a combination of technologies to address the challenges of intuitive control in assistive robotics. The system integrates commercial eye-tracking glasses with an RGB-D camera, whose depth information is crucial for estimating the gaze point in three dimensions (3D). This setup allows the system to decode user intentions from gaze patterns and seamlessly initiate robot-assisted actions such as reaching for and grasping objects. Eye-tracking is an attractive interface because it is non-invasive, requires minimal learning time, and remains viable for most users with movement disabilities.
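To make the gaze-to-3D step concrete, the following is a minimal sketch of how a 2D gaze point can be back-projected through a registered depth image using a pinhole camera model. This is our illustration under stated assumptions, not the paper's code; the function name, intrinsics, and frame conventions are hypothetical.

```python
import numpy as np

def gaze_to_3d(gaze_px, depth_image, fx, fy, cx, cy):
    """Back-project a 2D gaze point (pixels) into 3D camera coordinates.

    gaze_px:     (u, v) gaze location in the RGB-D camera's image plane
    depth_image: depth map in metres, registered to the RGB image
    fx, fy:      focal lengths in pixels (camera intrinsics)
    cx, cy:      principal point in pixels
    """
    u, v = int(round(gaze_px[0])), int(round(gaze_px[1]))
    z = depth_image[v, u]            # depth at the gazed pixel, in metres
    if z <= 0 or np.isnan(z):        # invalid or missing depth reading
        return None
    x = (u - cx) * z / fx            # pinhole back-projection
    y = (v - cy) * z / fy
    return np.array([x, y, z])       # 3D gaze point in the camera frame
```

In practice the resulting point would still need to be transformed into the robot's base frame (e.g. via a tf chain in ROS) before it can serve as a reaching target.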

The underlying architecture leverages Robot Operating System (ROS) for real-time data integration and control. Key components include a convolutional neural network for object recognition, an optical head-tracking system for precise user positioning, a Universal Robots UR10 for reach support, and the BioServo Carbonhand for grasp assistance. The interaction between these components is managed using a Finite State Machine (FSM), which enables context-aware execution of predefined sequences of actions based on the user's current gaze and interaction state.
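As an illustration of how an FSM can gate these components, the sketch below models a few plausible states and gaze-driven events. The state names, events, and transition table are our own assumptions; the paper's actual FSM and its ROS wiring may differ.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    REACHING = auto()
    GRASPING = auto()
    HOLDING = auto()

# Transition table: (current state, event) -> next state.
# Events here are illustrative stand-ins for gaze and robot feedback signals.
TRANSITIONS = {
    (State.IDLE,     "gaze_on_object"): State.REACHING,
    (State.REACHING, "target_reached"): State.GRASPING,
    (State.GRASPING, "grasp_closed"):   State.HOLDING,
    (State.HOLDING,  "gaze_on_target"): State.REACHING,  # e.g. place or pour
    (State.HOLDING,  "release"):        State.IDLE,
}

class TaskFSM:
    def __init__(self):
        self.state = State.IDLE

    def on_event(self, event: str) -> State:
        # Ignore events that are not valid in the current state.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state
```

The benefit of this structure is context awareness: the same gaze fixation means "reach there" when the hand is empty but "place or pour there" once an object is held.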

The authors present compelling evaluation metrics indicating the robustness and accuracy of their gaze estimation method. Evaluated with eight subjects, the system demonstrated a mean Euclidean error of 4.68±0.14 cm, attesting to its precision in converting gaze data into actionable 3D coordinates. A proof-of-concept evaluation with five subjects additionally showed high success rates in task completion: 100% of reach-to-gaze-point actions, 96% of pick-and-place tasks, and 76% of pick-and-pour tasks were fully completed.
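For reference, the reported accuracy is a mean Euclidean distance between estimated and ground-truth 3D gaze points; a minimal sketch of that computation follows. Array names are illustrative, and whether the ±0.14 cm denotes a standard deviation or a standard error is not restated here.

```python
import numpy as np

def mean_euclidean_error(est: np.ndarray, gt: np.ndarray):
    """est, gt: (N, 3) arrays of estimated and ground-truth 3D gaze points.

    Returns the mean per-sample Euclidean distance and its spread.
    """
    errors = np.linalg.norm(est - gt, axis=1)  # per-sample distance
    return errors.mean(), errors.std()
```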

Despite its promising results, the system faced certain limitations, particularly in mechanical design elements related to pouring actions. The experimental outcomes indicate that physical alterations to the wrist attachment could enhance system stability for fluid handling tasks, suggesting avenues for future refinements. Another area identified for improvement is user feedback and system explainability, which could enhance user confidence and predictability during operation.

The work posits several significant implications in the field of human-robot interaction, highlighting potential applications beyond assistive robotics, such as collaborative robotics, social robotics, and autonomous vehicles. The conceptual framework of using action grammars for intuitive, gaze-driven robotic control offers a scalable and adaptable model for various robotic applications that require human oversight or interaction.
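To illustrate the action-grammar idea, high-level tasks can be written as production rules that expand into sequences of primitive robot actions. The sketch below is our own rendering under that assumption; the nonterminals, terminals, and rules are hypothetical, not the paper's grammar.

```python
# Nonterminals (upper case) map to sequences of symbols;
# anything without a rule is a terminal, i.e. a primitive robot action.
RULES = {
    "PICK":           ["reach_source", "grasp"],
    "PLACE":          ["reach_target", "release"],
    "POUR":           ["reach_target", "tilt", "untilt"],
    "PICK_AND_PLACE": ["PICK", "PLACE"],
    "PICK_AND_POUR":  ["PICK", "POUR"],
}

def expand(symbol):
    """Recursively expand a task symbol into its primitive action sequence."""
    if symbol not in RULES:          # terminal: a primitive robot action
        return [symbol]
    out = []
    for s in RULES[symbol]:
        out.extend(expand(s))
    return out

# expand("PICK_AND_POUR")
# -> ['reach_source', 'grasp', 'reach_target', 'tilt', 'untilt']
```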

In conclusion, the proposed gaze-based, context-aware robotic system represents a sophisticated and user-centered approach to assistive technology. By harnessing the natural modality of gaze and combining it with cutting-edge robotic and computer vision technologies, the authors present a compelling solution that prioritizes ease of use and accessibility. Future research could focus on expanding the system's capabilities, improving mechanical design for specific tasks, and exploring new application domains to maximize the potential benefits of this innovative integration of gaze and robotics.
