ImageManip: Image-based Robotic Manipulation with Affordance-guided Next View Selection (2310.09069v1)

Published 13 Oct 2023 in cs.RO and cs.AI

Abstract: For future home-assistant robots, 3D articulated object manipulation is essential for interacting with the environment. Many existing studies use 3D point clouds as the primary input for manipulation policies. However, point clouds are sparse and costly to acquire, which limits the practicality of this approach. In contrast, RGB images offer high-resolution observations using cost-effective devices but lack 3D geometric information. To overcome these limitations, we present a novel image-based robotic manipulation framework. The framework captures multiple perspectives of the target object and infers depth information to complement its geometry. Initially, the system employs an eye-on-hand RGB camera to capture an overall view of the target object and predicts an initial depth map and a coarse affordance map. The affordance map indicates actionable areas on the object and serves as a constraint for selecting subsequent viewpoints. Based on this global visual prior, we adaptively identify the optimal next viewpoint for a detailed observation of the area where manipulation is likely to succeed. We leverage geometric consistency to fuse the views, resulting in a refined depth map and a more precise affordance map for robot manipulation decisions. By comparing with prior works that adopt point clouds or RGB images as inputs, we demonstrate the effectiveness and practicality of our method. On the project webpage (https://sites.google.com/view/imagemanip), real-world experiments further highlight the potential of our method for practical deployment.
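The abstract says the coarse affordance map constrains where the next view should look. As a rough illustration only, the sketch below back-projects a predicted depth map with a pinhole camera model and takes the affordance-weighted 3D centroid as a point a closer view could be aimed at. This is not the paper's method: the function names, the intrinsics parameters (`fx`, `fy`, `cx`, `cy`), and the centroid heuristic itself are all assumptions made for the example.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth to a camera-frame 3D point (pinhole model)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def affordance_focus(depth_map, affordance_map, fx, fy, cx, cy):
    """Affordance-weighted 3D centroid of all back-projected pixels.

    Hypothetical heuristic: the returned point stands in for the region
    that a second, closer viewpoint would be directed at.
    """
    total = 0.0
    focus = [0.0, 0.0, 0.0]
    # Rows index v (image height), columns index u (image width).
    for v, (depth_row, aff_row) in enumerate(zip(depth_map, affordance_map)):
        for u, (depth, score) in enumerate(zip(depth_row, aff_row)):
            point = backproject(u, v, depth, fx, fy, cx, cy)
            for i in range(3):
                focus[i] += score * point[i]
            total += score
    if total == 0:
        raise ValueError("affordance map has no positive mass")
    return tuple(c / total for c in focus)


# Toy 5x5 example: flat surface 2 m away, one actionable pixel at row 1, column 3.
depth_map = [[2.0] * 5 for _ in range(5)]
affordance_map = [[0.0] * 5 for _ in range(5)]
affordance_map[1][3] = 1.0
focus = affordance_focus(depth_map, affordance_map, fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```

With a one-hot affordance map the centroid reduces to the back-projection of that single pixel; with a spread-out map it is pulled toward high-scoring regions. A real system would still need to turn this point into a reachable camera pose.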

Authors (9)
  1. Xiaoqi Li
  2. Yanzi Wang
  3. Yan Shen
  4. Ponomarenko Iaroslav
  5. Haoran Lu
  6. Qianxu Wang
  7. Boshi An
  8. Jiaming Liu
  9. Hao Dong
Citations (5)