Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Visual Affordance Prediction for Guiding Robot Exploration (2305.17783v1)

Published 28 May 2023 in cs.RO and cs.AI

Abstract: Motivated by the intuitive understanding humans have about the space of possible interactions, and the ease with which they can generalize this understanding to previously unseen scenes, we develop an approach for learning visual affordances for guiding robot exploration. Given an input image of a scene, we infer a distribution over plausible future states that can be achieved via interactions with it. We use a Transformer-based model to learn a conditional distribution in the latent embedding space of a VQ-VAE and show that these models can be trained using large-scale and diverse passive data, and that the learned models exhibit compositional generalization to diverse objects beyond the training distribution. We show how the trained affordance model can be used for guiding exploration by acting as a goal-sampling distribution, during visual goal-conditioned policy learning in robotic manipulation.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (3)
  1. Homanga Bharadhwaj (36 papers)
  2. Abhinav Gupta (178 papers)
  3. Shubham Tulsiani (71 papers)
Citations (11)

Summary

We haven't generated a summary for this paper yet.