Provable Compositional Generalization for Object-Centric Learning (2310.05327v2)
Abstract: Learning representations that generalize to novel compositions of known concepts is crucial for bridging the gap between human and machine perception. One prominent effort is learning object-centric representations, which are widely conjectured to enable compositional generalization. Yet, it remains unclear when this conjecture will be true, as a principled theoretical or empirical understanding of compositional generalization is lacking. In this work, we investigate when compositional generalization is guaranteed for object-centric representations through the lens of identifiability theory. We show that autoencoders that satisfy structural assumptions on the decoder and enforce encoder-decoder consistency will learn object-centric representations that provably generalize compositionally. We validate our theoretical result and highlight the practical relevance of our assumptions through experiments on synthetic image data.
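The abstract points to two concrete ingredients: a decoder with structural (per-object, additive) form and an encoder-decoder consistency objective. The PyTorch sketch below illustrates one way these could fit together; the MLP architecture, slot count, and loss weighting are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumptions: slot dimensionality, MLP layers, and loss weights
# are illustrative, not taken from the paper). It shows the two ingredients the
# abstract names: a decoder that is additive across per-object slots, and an
# encoder-decoder consistency term that re-encodes the reconstruction.

import torch
import torch.nn as nn


class SlotAutoencoder(nn.Module):
    def __init__(self, num_slots=4, slot_dim=16, img_dim=3 * 64 * 64):
        super().__init__()
        self.num_slots, self.slot_dim = num_slots, slot_dim
        # Encoder maps the image to all slot latents jointly (hypothetical MLP).
        self.encoder = nn.Sequential(
            nn.Linear(img_dim, 512), nn.ReLU(),
            nn.Linear(512, num_slots * slot_dim),
        )
        # A single decoder is applied to each slot separately and the outputs
        # are summed, so the decoder is additive across slots (one possible
        # structural assumption on the decoder).
        self.slot_decoder = nn.Sequential(
            nn.Linear(slot_dim, 512), nn.ReLU(),
            nn.Linear(512, img_dim),
        )

    def encode(self, x):
        z = self.encoder(x.flatten(1))
        return z.view(-1, self.num_slots, self.slot_dim)

    def decode(self, slots):
        per_slot = self.slot_decoder(slots)   # (B, K, img_dim), one rendering per slot
        return per_slot.sum(dim=1)            # (B, img_dim), additive composition

    def forward(self, x):
        slots = self.encode(x)
        return self.decode(slots), slots


def training_loss(model, x, consistency_weight=1.0):
    """Reconstruction loss plus an encoder-decoder consistency term:
    re-encoding the reconstruction should recover the original slot latents."""
    recon, slots = model(x)
    recon_loss = ((recon - x.flatten(1)) ** 2).mean()
    re_slots = model.encode(recon.view_as(x))
    consistency_loss = ((re_slots - slots.detach()) ** 2).mean()
    return recon_loss + consistency_weight * consistency_loss


if __name__ == "__main__":
    model = SlotAutoencoder()
    x = torch.rand(8, 3, 64, 64)  # toy batch standing in for synthetic images
    loss = training_loss(model, x)
    loss.backward()
    print(float(loss))
```

In this reading, the additive decoder renders each slot on its own and sums the results, so unseen combinations of familiar slot contents stay within the decoder's range, while the consistency term ties the encoder to that decoder so that re-encoding a composed scene recovers its slots.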
- Thaddäus Wiedemer
- Jack Brady
- Alexander Panfilov
- Attila Juhos
- Matthias Bethge
- Wieland Brendel