
What You See is What You Grasp: User-Friendly Grasping Guided by Near-eye-tracking (2209.06122v2)

Published 13 Sep 2022 in cs.RO

Abstract: This work presents a next-generation human-robot interface that can infer and realize the user's manipulation intention via sight only. Specifically, we develop a system that integrates near-eye-tracking and robotic manipulation to enable user-specified actions (e.g., grasp, pick-and-place), where visual information is merged with human attention to create a mapping for desired robot actions. To enable sight-guided manipulation, a head-mounted near-eye-tracking device is developed to track eyeball movements in real time, so that the user's visual attention can be identified. To improve grasping performance, a transformer-based grasp model is then developed. Stacked transformer blocks are used to extract hierarchical features, where the number of channels is expanded at each stage while the resolution of the feature maps is reduced. Experimental validation demonstrates that the eye-tracking system yields low gaze estimation error and the grasping system yields promising results on multiple grasping datasets. This work is a proof of concept for a gaze-interaction-based assistive robot, which holds great promise for helping the elderly and people with upper-limb disabilities in their daily lives. A demo video is available at https://www.youtube.com/watch?v=yuZ1hukYUrM
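
The hierarchical feature extraction mentioned in the abstract (channels expand at each stage while the feature-map resolution shrinks) can be illustrated with a small shape calculation. This is only a sketch under assumed values: the abstract does not state the patch size, base channel width, number of stages, or scaling factor, so the defaults below (patch size 4, 96 base channels, 4 stages, doubling channels and halving resolution) are hypothetical and the function name `stage_shapes` is introduced here for illustration.

```python
from typing import List, Tuple


def stage_shapes(
    height: int,
    width: int,
    base_channels: int = 96,  # assumed value, not from the paper
    num_stages: int = 4,      # assumed value, not from the paper
    patch_size: int = 4,      # assumed value, not from the paper
) -> List[Tuple[int, int, int]]:
    """Return (channels, height, width) of the feature map at each stage."""
    # Patch embedding reduces the input resolution by the patch size.
    h, w, c = height // patch_size, width // patch_size, base_channels
    shapes = []
    for _ in range(num_stages):
        shapes.append((c, h, w))
        # Between stages: double the channels, halve the spatial resolution.
        c *= 2
        h //= 2
        w //= 2
    return shapes


for channels, h, w in stage_shapes(224, 224):
    print(f"channels={channels:4d}  resolution={h}x{w}")
```

Under these assumed defaults, a 224x224 input yields feature maps that shrink from 56x56 down to 7x7 while the channel count grows from 96 to 768; the actual configuration used by the authors may differ.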

Authors (8)
  1. Shaochen Wang (10 papers)
  2. Wei Zhang (1492 papers)
  3. Zhangli Zhou (4 papers)
  4. Jiaxi Cao (2 papers)
  5. Ziyang Chen (91 papers)
  6. Kang Chen (61 papers)
  7. Bin Li (514 papers)
  8. Zhen Kan (24 papers)
Citations (3)
