
Provably Learning Object-Centric Representations (2305.14229v1)

Published 23 May 2023 in cs.LG and cs.CV

Abstract: Learning structured representations of the visual world in terms of objects promises to significantly improve the generalization abilities of current machine learning models. While recent efforts to this end have shown promising empirical progress, a theoretical account of when unsupervised object-centric representation learning is possible is still lacking. Consequently, understanding the reasons for the success of existing object-centric methods as well as designing new theoretically grounded methods remains challenging. In the present work, we analyze when object-centric representations can provably be learned without supervision. To this end, we first introduce two assumptions on the generative process for scenes comprised of several objects, which we call compositionality and irreducibility. Under this generative process, we prove that the ground-truth object representations can be identified by an invertible and compositional inference model, even in the presence of dependencies between objects. We empirically validate our results through experiments on synthetic data. Finally, we provide evidence that our theory holds predictive power for existing object-centric models by showing a close correspondence between models' compositionality and invertibility and their empirical identifiability.
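The compositionality assumption in the abstract can be made concrete: informally, a decoder is compositional when each pixel of the rendered scene depends on at most one object's latent slot, which can be checked via the decoder's Jacobian. Below is a minimal, hypothetical NumPy sketch (the decoder, masks, and per-object renderer are illustrative assumptions, not the paper's actual model) that builds a toy compositional decoder over disjoint pixel supports and verifies the per-pixel slot condition numerically.

```python
import numpy as np

def decoder(z, masks):
    # z: (K, d) per-object latents; masks: (K, P) binary masks with
    # disjoint pixel supports. Each pixel is rendered from at most one
    # object's latent, so this toy decoder is compositional by construction.
    # (The per-object renderer here is a hypothetical linear readout.)
    K, d = z.shape
    P = masks.shape[1]
    x = np.zeros(P)
    for k in range(K):
        x += masks[k] * z[k].sum()
    return x

def pixelwise_compositional(z, masks, eps=1e-6):
    # Finite-difference Jacobian of the decoder. A decoder is compositional
    # (in the informal sense above) iff each pixel row of the Jacobian has
    # nonzero derivatives w.r.t. at most one object's latent slot.
    K, d = z.shape
    base = decoder(z, masks)
    J = np.zeros((base.size, K, d))
    for k in range(K):
        for j in range(d):
            zp = z.copy()
            zp[k, j] += eps
            J[:, k, j] = (decoder(zp, masks) - base) / eps
    slot_active = np.abs(J).max(axis=2) > 1e-8   # (P, K): slot touches pixel?
    return slot_active.sum(axis=1) <= 1          # per-pixel check

# Tiny example: 2 objects with disjoint 3-pixel supports.
masks = np.array([[1, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 1]], dtype=float)
z = np.random.randn(2, 2)
assert pixelwise_compositional(z, masks).all()
```

The key design point is that the check is purely structural: it inspects which latent slots influence which pixels, without needing the ground-truth segmentation, which is why such Jacobian-based diagnostics can be applied to trained object-centric models as well.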

Authors (6)
  1. Jack Brady (5 papers)
  2. Roland S. Zimmermann (17 papers)
  3. Yash Sharma (45 papers)
  4. Bernhard Schölkopf (412 papers)
  5. Julius von Kügelgen (42 papers)
  6. Wieland Brendel (55 papers)
Citations (28)
