Determine whether multimodality confers System 2 competence in large language models
Determine whether multimodal large language models, such as GPT-4V, attain System 2 competence (that is, the deliberate reasoning and planning abilities that Kahneman attributes to System 2) by virtue of their added modalities, or whether those modalities merely extend their System 1 capabilities for reflexive pattern completion.
References
While multi-modality is a great addition that increases the coverage of their System 1 imagination (Figure 1), it is not clear that this gives them System 2 competence.
Quoted from "LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks" (arXiv:2402.01817, Kambhampati et al., 2 Feb 2024), Section 4, Related Work.