Compositional Generalization from First Principles (2307.05596v1)

Published 10 Jul 2023 in cs.LG and stat.ML

Abstract: Leveraging the compositional nature of our world to expedite learning and facilitate generalization is a hallmark of human perception. In machine learning, on the other hand, achieving compositional generalization has proven to be an elusive goal, even for models with explicit compositional priors. To get a better handle on compositional generalization, we here approach it from the bottom up: Inspired by identifiable representation learning, we investigate compositionality as a property of the data-generating process rather than the data itself. This reformulation enables us to derive mild conditions on only the support of the training distribution and the model architecture, which are sufficient for compositional generalization. We further demonstrate how our theoretical framework applies to real-world scenarios and validate our findings empirically. Our results set the stage for a principled theoretical study of compositional generalization.
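
The abstract frames compositionality as a property of the data-generating process and points to conditions on the support of the training distribution. The toy sketch below is an illustrative assumption rather than the paper's construction: it uses a two-slot additive linear generative process, an L-shaped training support that covers each slot's values but withholds one quadrant of their combinations, and a model whose additive, slot-wise structure matches the generative process. All names, the linear renderers, and the specific support are hypothetical choices made for this sketch.

```python
# Minimal sketch, assuming a two-slot additive linear data-generating process
# and an L-shaped training support; not the paper's actual setup or proofs.
import numpy as np

rng = np.random.default_rng(0)

D = 8                                   # observation dimension
A1 = rng.normal(size=(D, 1))            # unknown renderer for slot 1
A2 = rng.normal(size=(D, 1))            # unknown renderer for slot 2

def render(z1, z2):
    # Compositional generative process: x = phi1(z1) + phi2(z2),
    # each slot rendered by its own linear map, composed additively.
    return A1 @ z1.reshape(1, -1) + A2 @ z2.reshape(1, -1)

# Training support: each slot's values occur in training, but the
# "both slots large" quadrant of combinations is never observed.
n = 2000
z1_tr = rng.uniform(0, 1, n)
z2_tr = rng.uniform(0, 1, n)
keep = (z1_tr < 0.5) | (z2_tr < 0.5)    # L-shaped support
Z_tr = np.stack([z1_tr[keep], z2_tr[keep]], axis=1)
X_tr = render(z1_tr[keep], z2_tr[keep]).T

# Model with the same additive, slot-wise (here linear) structure,
# fit by least squares to map latents to observations.
W, *_ = np.linalg.lstsq(Z_tr, X_tr, rcond=None)

# Evaluate on slot combinations excluded from the training support.
z1_te = rng.uniform(0.5, 1, 500)
z2_te = rng.uniform(0.5, 1, 500)
Z_te = np.stack([z1_te, z2_te], axis=1)
X_te = render(z1_te, z2_te).T

print("MSE on unseen slot combinations:", np.mean((Z_te @ W - X_te) ** 2))
```

In this sketch the reconstruction error on the withheld quadrant is near machine precision: because the training support covers each slot's range somewhere and the model shares the generative process's additive structure, the fit extrapolates to unseen combinations. With a generic model that ignores the compositional structure, no such guarantee would hold, which is the kind of architecture-plus-support condition the abstract alludes to.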

