Influence of reward design on latent visual reasoning in MLLMs
Investigate how different reinforcement learning reward designs influence latent visual reasoning in multimodal large language models (MLLMs) that generate continuous latent visual embeddings as intermediate thoughts during reasoning. Determine how alternative reward functions affect both latent-embedding quality and overall task performance.
References
Second, we have not yet explored how different reward designs might influence latent visual reasoning in MLLMs, leaving room for exploration and further enhancement.
— Monet: Reasoning in Latent Visual Space Beyond Images and Language
(2511.21395 - Wang et al., 26 Nov 2025) in Section 6 Conclusion and Limitations