
Visual Grounding of Learned Physical Models (2004.13664v2)

Published 28 Apr 2020 in cs.LG, cs.CV, and stat.ML

Abstract: Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions. The abilities to perform physical reasoning and to adapt to new environments, while intrinsic to humans, remain challenging to state-of-the-art computational models. In this work, we present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors. The visual prior predicts a particle-based representation of the system from visual observations. An inference module operates on those particles, predicting and refining estimates of particle locations, object states, and physical parameters, subject to the constraints imposed by the dynamics prior, which we refer to as visual grounding. We demonstrate the effectiveness of our method in environments involving rigid objects, deformable materials, and fluids. Experiments show that our model can infer the physical properties within a few observations, which allows the model to quickly adapt to unseen scenarios and make accurate predictions into the future.
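To make the idea of inferring physical parameters "within a few observations" concrete, here is a minimal toy sketch. It is not the authors' neural model: it assumes a single particle, a hand-written simulator standing in for the dynamics prior, and one unknown parameter (gravity) recovered by a simple ternary search over the simulation error. All names (`rollout`, `loss`, `infer_gravity`) and the search bounds are hypothetical and chosen for illustration only.

```python
# Toy illustration only: a hand-written simulator stands in for a learned
# dynamics prior, and a ternary search stands in for the inference module.

def rollout(p0, v0, g, dt, steps):
    """Simulate a particle's height under gravity g (semi-implicit Euler)."""
    p, v = p0, v0
    trajectory = []
    for _ in range(steps):
        v -= g * dt          # update velocity from the physical parameter
        p += v * dt          # update position from the new velocity
        trajectory.append(p)
    return trajectory


def loss(g, observed, p0, v0, dt):
    """Squared error between simulated and observed particle positions."""
    predicted = rollout(p0, v0, g, dt, len(observed))
    return sum((a - b) ** 2 for a, b in zip(predicted, observed))


def infer_gravity(observed, p0, v0, dt, lo=0.0, hi=20.0, iters=60):
    """Refine an estimate of g by ternary search (the loss is convex in g)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if loss(m1, observed, p0, v0, dt) < loss(m2, observed, p0, v0, dt):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


# Five observations generated with a "true" parameter the estimator never sees.
observed = rollout(1.0, 0.0, 9.8, 0.1, 5)
g_hat = infer_gravity(observed, 1.0, 0.0, 0.1)

# With the parameter estimated, roll the same dynamics further into the future.
future = rollout(1.0, 0.0, g_hat, 0.1, 10)
```

The point of the sketch is the loop structure: a fixed dynamics model constrains which parameter values can explain the observations, and the refined estimate is then reused for forward prediction.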

Authors (8)
  1. Yunzhu Li (56 papers)
  2. Toru Lin (9 papers)
  3. Kexin Yi (9 papers)
  4. Daniel M. Bear (7 papers)
  5. Daniel L. K. Yamins (26 papers)
  6. Jiajun Wu (249 papers)
  7. Joshua B. Tenenbaum (257 papers)
  8. Antonio Torralba (178 papers)
Citations (74)