Data-centric analysis of compositional generalization in multimodal models
Investigate, from a data-centric perspective, how multimodal models understand and generalize to novel compositions of concepts, that is, compositions not encountered during pretraining. The aim is to characterize and improve compositional generalization, a problem distinct from traditional zero-shot learning.
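To make the notion of a "novel composition" concrete, the following minimal sketch treats it under one possible reading: a pair of concepts that each occur in the pretraining captions but never co-occur in the same caption. This is an illustrative assumption, not the procedure used in the cited paper. The function names (`concept_pair_counts`, `novel_compositions`), the whitespace tokenization, and the toy captions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def concept_pair_counts(captions, concepts):
    """Count how often each concept, and each concept pair, appears in the captions."""
    single = Counter()
    pair = Counter()
    for caption in captions:
        # Assumption: naive lowercase whitespace tokenization stands in for
        # whatever concept-matching a real pipeline would use.
        tokens = set(caption.lower().split())
        present = sorted(c for c in concepts if c in tokens)
        single.update(present)
        # Pairs are stored in sorted order so lookups are order-independent.
        pair.update(combinations(present, 2))
    return single, pair


def novel_compositions(single, pair, concepts, min_single=1):
    """Return pairs whose members are each seen at least min_single times but never together."""
    seen = sorted(c for c in concepts if single[c] >= min_single)
    return [p for p in combinations(seen, 2) if pair[p] == 0]


# Toy stand-in for a pretraining caption corpus (hypothetical data).
captions = [
    "a red car on the street",
    "a blue bird on a branch",
    "a red bird in the sky",
]
concepts = ["red", "blue", "car", "bird"]

single, pair = concept_pair_counts(captions, concepts)
novel = novel_compositions(single, pair, concepts)
print(novel)
```

On this toy corpus, the pairs ("bird", "car"), ("blue", "car"), and ("blue", "red") are flagged as unseen compositions. Note that the sketch flags every absent pair, including ones that may be semantically implausible, so a data-centric study would likely need an additional filter and would then evaluate model performance on the remaining held-out compositions.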
References
This is distinct from traditional zero-shot learning and presents an intriguing, yet unresolved challenge: analyzing compositional generalization from a data-centric perspective.
— No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
(2404.04125 - Udandarao et al., 4 Apr 2024) in Section: Conclusions and Open Problems