Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 88 tok/s
Gemini 2.5 Pro 56 tok/s Pro
GPT-5 Medium 17 tok/s
GPT-5 High 21 tok/s Pro
GPT-4o 90 tok/s
GPT OSS 120B 468 tok/s Pro
Kimi K2 213 tok/s Pro
2000 character limit reached

Dual Thinking and Logical Processing -- Are Multi-modal Large Language Models Closing the Gap with Human Vision ? (2406.06967v4)

Published 11 Jun 2024 in cs.CV, cs.AI, and eess.IV

Abstract: The dual thinking framework considers fast, intuitive, and slower logical processing. The perception of dual thinking in vision requires images where inferences from intuitive and logical processing differ, and the latter is under-explored in current studies. We introduce a novel adversarial dataset to provide evidence for the dual thinking framework in human vision, which also facilitates the study of the qualitative behavior of deep learning models. Our psychophysical studies show the presence of multiple inferences in rapid succession, and analysis of errors shows that the early stopping of visual processing can result in missing relevant information. MLLMs (Multi-modal LLMs) and VLMs (Vision LLMs) have made significant progress in correcting errors in intuitive processing in human vision and showed enhanced performance on images requiring logical processing. However, their improvements in logical processing have not kept pace with their advancements in intuitive processing. In contrast, segmentation models exhibit errors similar to those seen in intuitive human processing and lack understanding of sub-structures, as indicated by errors related to sub-components in identified instances. As AI (Artificial Intelligence)-based systems find increasing applications in safety-critical domains like autonomous driving, the integration of logical processing capabilities becomes essential. This not only enhances performance but also addresses the limitations of scaling-based approaches while ensuring robustness and reliability in real-world environments.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Ai Generate Text Spark Streamline Icon: https://streamlinehq.com

Paper Prompts

Sign up for free to create and run prompts on this paper using GPT-5.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.