Definition of a Meaningful Interleaved Chain-of-Thought
Determine rigorous, operational criteria for what makes an interleaved chain-of-thought meaningful in multimodal reasoning. The criteria should specify how textual tokens and image tokens interact as complementary modalities that mutually advance reasoning, rather than serving as merely isomorphic representations, and should be accompanied by evaluation procedures that verify them across diverse tasks.
References
Multimodal reasoning requires iterative coordination between language and vision, yet it remains unclear what constitutes a meaningful interleaved chain of thought.
— ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
(2510.27492 - Gu et al., 30 Oct 2025) in Abstract (page 1)