DMotion: Robotic Visuomotor Control with Unsupervised Forward Model Learned from Videos (2103.04301v3)

Published 7 Mar 2021 in cs.RO

Abstract: Learning an accurate model of the environment is essential for model-based control tasks. Existing methods in robotic visuomotor control usually learn from data with heavily labelled actions, object entities or locations, which can be demanding in many cases. To cope with this limitation, we propose a method, dubbed DMotion, that trains a forward model from video data only, by disentangling the motion of the controllable agent to model the transition dynamics. An object extractor and an interaction learner are trained in an end-to-end manner without supervision. The agent's motions are explicitly represented using spatial transformation matrices with physical meanings. In the experiments, DMotion achieves superior performance in learning an accurate forward model in a Grid World environment, as well as in a more realistic simulated robot control environment. With the accurately learned forward models, we further demonstrate their use in model predictive control as an effective approach for robotic manipulation.
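The abstract's final claim, using a learned forward model inside model predictive control, can be sketched with a standard random-shooting MPC loop. The snippet below is illustrative only and is not DMotion's implementation: `forward_model` stands in for the paper's learned transition network (here replaced by toy additive dynamics so the loop runs), and `cost` is a hypothetical distance-to-goal objective.

```python
import numpy as np

def forward_model(state, action):
    """Toy stand-in for DMotion's learned transition model:
    next state = current state + action."""
    return state + action

def cost(state, goal):
    """Hypothetical planning objective: Euclidean distance to goal."""
    return float(np.linalg.norm(state - goal))

def random_shooting_mpc(state, goal, horizon=5, n_candidates=256, seed=0):
    """Sample candidate action sequences, roll each out through the
    (learned) forward model, and return the first action of the
    lowest-cost sequence along with its predicted final cost."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0,
                             size=(n_candidates, horizon, state.shape[0]))
    best_action, best_cost = None, np.inf
    for seq in candidates:
        s = state.copy()
        for a in seq:
            s = forward_model(s, a)  # imagined rollout, no environment calls
        c = cost(s, goal)
        if c < best_cost:
            best_cost, best_action = c, seq[0]
    return best_action, best_cost

state, goal = np.zeros(2), np.array([1.0, 1.0])
action, predicted_cost = random_shooting_mpc(state, goal)
```

In a receding-horizon controller, only the first action of the best sequence is executed before replanning from the newly observed state; the quality of the plan hinges entirely on the accuracy of the learned forward model, which is the point the paper's experiments evaluate.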

Authors (6)
  1. Haoqi Yuan
  2. Ruihai Wu
  3. Andrew Zhao
  4. Haipeng Zhang
  5. Zihan Ding
  6. Hao Dong
Citations (3)
