Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Grounding Object Relations in Language-Conditioned Robotic Manipulation with Semantic-Spatial Reasoning (2303.17919v1)

Published 31 Mar 2023 in cs.RO

Abstract: Grounded understanding of natural language in physical scenes can greatly benefit robots that follow human instructions. In object manipulation scenarios, existing end-to-end models are proficient at understanding semantic concepts, but typically cannot handle complex instructions involving spatial relations among multiple objects. which require both reasoning object-level spatial relations and learning precise pixel-level manipulation affordances. We take an initial step to this challenge with a decoupled two-stage solution. In the first stage, we propose an object-centric semantic-spatial reasoner to select which objects are relevant for the language instructed task. The segmentation of selected objects are then fused as additional input to the affordance learning stage. Simply incorporating the inductive bias of relevant objects to a vision-language affordance learning agent can effectively boost its performance in a custom testbed designed for object manipulation with spatial-related language instructions.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (3)
  1. Qian Luo (13 papers)
  2. Yunfei Li (30 papers)
  3. Yi Wu (171 papers)
Citations (4)

Summary

We haven't generated a summary for this paper yet.