Data-centric analysis of compositional generalization in multimodal models

Investigate, from a data-centric perspective, how multimodal models can understand and generalize to novel compositions of concepts not encountered during pretraining, thereby characterizing and improving compositional generalization distinct from traditional zero-shot learning.

Background

The paper distinguishes compositional generalization—understanding new combinations of concepts—from conventional zero-shot learning and highlights it as an important yet unresolved area.

The authors explicitly call for analyzing compositional generalization through data-centric methods, suggesting that current approaches and empirical understanding are insufficient.
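
To make the distinction concrete, here is a minimal sketch of how a data-centric check might separate the two cases. It is an illustration only, not the paper's method. It assumes each pretraining caption has already been reduced to a set of concept strings, and the function name `classify_query` and the three labels are hypothetical. It also tracks only pairwise co-occurrence, which is a simplification.

```python
from itertools import combinations


def classify_query(query_concepts, pretraining_concept_sets):
    """Label a query by how its concepts relate to the pretraining data.

    Assumption: every caption is already reduced to a set of concept strings.
    Only pairwise co-occurrence is tracked (a simplification).
    """
    seen_concepts = set()
    seen_pairs = set()
    for concepts in pretraining_concept_sets:
        unique = set(concepts)
        seen_concepts.update(unique)
        # Record every unordered pair of concepts that co-occur in one caption.
        seen_pairs.update(frozenset(p) for p in combinations(sorted(unique), 2))

    query = set(query_concepts)
    if not query <= seen_concepts:
        # At least one concept never appears: the conventional zero-shot case.
        return "novel_concept"

    query_pairs = {frozenset(p) for p in combinations(sorted(query), 2)}
    if query_pairs <= seen_pairs:
        return "seen_composition"
    # Every concept was seen, but some pair never co-occurred.
    return "novel_composition"


pretraining = [{"red", "car"}, {"blue", "bird"}]
print(classify_query({"blue", "car"}, pretraining))
print(classify_query({"green", "car"}, pretraining))
```

In this toy corpus, "blue car" combines two concepts that each appear but never together, whereas "green car" contains a concept that never appears at all. The first case is the one the open problem targets.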

References

This is distinct from traditional zero-shot learning and presents an intriguing, yet unresolved challenge: analyzing compositional generalization from a data-centric perspective.

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance (2404.04125 - Udandarao et al., 4 Apr 2024) in Section: Conclusions and Open Problems