Verb Mirage: Unveiling and Assessing Verb Concept Hallucinations in Multimodal Large Language Models (2412.04939v1)

Published 6 Dec 2024 in cs.CV

Abstract: Multimodal LLMs (MLLMs) have garnered significant attention recently and demonstrate outstanding capabilities in various tasks such as OCR, VQA, captioning, etc. However, hallucination remains a persistent issue. Numerous methods have been proposed to mitigate hallucinations and have achieved notable improvements, but they primarily target hallucinations about object/noun-related concepts. Verb concepts, which are crucial for understanding human actions, have been largely overlooked. In this paper, to the best of our knowledge, we are the first to investigate the verb hallucination phenomenon of MLLMs from various perspectives. Our findings reveal that most state-of-the-art MLLMs suffer from severe verb hallucination. We also evaluated existing mitigation methods designed for object concept hallucination and found that they do not effectively address verb hallucination. To address this issue, we propose a novel tuning method based on rich verb knowledge. The experimental results demonstrate that our method significantly reduces hallucinations related to verbs. Our code and data will be made publicly available.

Authors (9)
  1. Zehao Wang (38 papers)
  2. Xinpeng Liu (19 papers)
  3. Xiaoqian Wu (8 papers)
  4. Yudonglin Zhang (1 paper)
  5. Zhou Fang (41 papers)
  6. Yifan Fang (1 paper)
  7. Junfu Pu (11 papers)
  8. Cewu Lu (203 papers)
  9. Yong-Lu Li (47 papers)