
Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models (2505.19474v1)

Published 26 May 2025 in cs.AI

Abstract: Multimodal LLMs (MLLMs) have demonstrated strong performance in visual understanding tasks, yet they often suffer from object hallucinations: generating descriptions of objects that are inconsistent with or entirely absent from the input. This issue is closely related to dataset biases, where frequent co-occurrences of objects lead to entangled semantic representations across modalities. As a result, models may erroneously activate object representations that are commonly associated with the input but not actually present. To address this, we propose a causality-driven disentanglement framework that mitigates hallucinations through causal intervention. Our approach includes a Causal-Driven Projector in the visual pathway and a Causal Intervention Module integrated into the final transformer layer of the LLM. These components work together to reduce spurious correlations caused by biased training data. Experimental results show that our method significantly reduces hallucinations while maintaining strong performance on multiple multimodal benchmarks. Visualization analyses further confirm improved separability of object representations. The code is available at: https://github.com/IgniSavium/Causal-LLaVA

