Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning (2404.12966v4)
Abstract: Recently, Multimodal Large Language Models (MLLMs) have achieved significant success across multiple disciplines due to their exceptional instruction-following capabilities and extensive world knowledge. However, whether these MLLMs possess human-like compositional reasoning abilities remains an open problem. To unveil their reasoning behaviors, we first curate a Multimodal Assumptive Reasoning Benchmark (MARS-Bench). Interestingly, we find that most prevalent MLLMs are easily fooled by the introduction of a presupposition into the question, even though such presuppositions appear trivial to human reasoning. In addition, we propose a simple yet effective method, Active Deduction (AD), which encourages the model to actively perform composite deduction before reaching a final decision. Equipped with the proposed AD method, an MLLM demonstrates significant improvements in assumptive reasoning without compromising its general-purpose question-answering performance. We also provide extensive evaluations of both open-source and proprietary MLLMs on MARS-Bench, along with experimental analyses of the AD method.
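As a rough illustration of the Active Deduction idea only: the abstract does not give the paper's actual prompt, so the template wording, the choice of the OpenAI Python client, the model name, and the helper function below are all assumptions. The sketch simply wraps an assumptive question in a "verify the presupposition first, then answer" instruction before sending it to a multimodal model.

```python
# Minimal sketch of an Active-Deduction-style prompt, assuming an OpenAI-compatible
# multimodal chat API. The prompt text and model name are illustrative, not the
# paper's actual method.
from openai import OpenAI

AD_PROMPT = (
    "Before answering, deduce step by step whether the assumption embedded in the "
    "question actually holds for the image. If it does not hold, point that out "
    "explicitly; otherwise, answer the question.\n\nQuestion: {question}"
)

def ask_with_active_deduction(client: OpenAI, image_url: str, question: str) -> str:
    """Query a multimodal model with a deduction-first prompt (hypothetical helper)."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model choice for illustration
        messages=[{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": AD_PROMPT.format(question=question)},
            ],
        }],
    )
    return response.choices[0].message.content

# Example usage (hypothetical image and assumptive question):
# client = OpenAI()
# print(ask_with_active_deduction(client, "https://example.com/kitchen.jpg",
#                                 "What color is the cat sleeping on the stove?"))
```

The intent of such a prompt matches the abstract's description of AD: the model is pushed to check the presupposition (here, whether a cat is actually present) before committing to an answer, rather than answering the loaded question directly.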
Authors: Yian Li, Wentao Tian, Yang Jiao, Jingjing Chen, Yu-Gang Jiang, Na Zhao