CAI: Caption-Sensitive Attention Intervention for Mitigating Object Hallucination in Large Vision-Language Models (2506.23590v1)
Abstract: Although Large Vision-Language Models (LVLMs) have demonstrated powerful capabilities in interpreting visual information, they frequently produce content that deviates from the visual input, leading to object hallucination. Existing mitigation methods mostly rely on expensive manual annotation and training, or substantially increase inference time. In this work, we observe that LVLMs attend to visual information significantly more strongly when answering caption queries than non-caption queries. Inspired by this phenomenon, we propose Caption-sensitive Attention Intervention (CAI), a training-free, plug-and-play hallucination mitigation method that leverages the attention activation pattern elicited by caption queries to enhance LVLMs' visual perception. Extensive experiments across four benchmarks, covering both discriminative and generative tasks, demonstrate that CAI achieves state-of-the-art (SOTA) hallucination mitigation with only minimal additional inference cost.
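The abstract does not spell out the exact intervention rule, but the core idea is to characterize how strongly attention heads attend to image tokens under caption queries and to re-apply that pattern at inference time. A minimal sketch of one such intervention, assuming a pre-softmax bias added to attention logits at visual-token positions; the function name `apply_caption_bias`, the per-head `bias` vector, and the `visual_mask` argument are illustrative assumptions, not the paper's actual interface:

```python
import torch

def apply_caption_bias(attn_logits: torch.Tensor,
                       visual_mask: torch.Tensor,
                       bias: torch.Tensor) -> torch.Tensor:
    """Shift attention toward image tokens, then renormalize.

    attn_logits: (num_heads, q_len, k_len) pre-softmax attention scores.
    visual_mask: (k_len,) boolean mask marking image-token key positions.
    bias:        (num_heads,) per-head strength, hypothetically estimated
                 offline from the attention pattern observed under
                 caption-style queries.
    """
    # Add the caption-derived bias only at visual-token key positions.
    shifted = attn_logits + bias[:, None, None] * visual_mask.to(attn_logits.dtype)[None, None, :]
    # Renormalize so each query row is again a probability distribution.
    return torch.softmax(shifted, dim=-1)


if __name__ == "__main__":
    heads, q_len, k_len = 8, 4, 16
    logits = torch.randn(heads, q_len, k_len)
    visual_mask = torch.zeros(k_len, dtype=torch.bool)
    visual_mask[:6] = True              # suppose the first 6 keys are image tokens
    bias = torch.full((heads,), 0.5)    # hypothetical per-head intervention strength
    probs = apply_caption_bias(logits, visual_mask, bias)
    print(probs.sum(-1))                # each attention row sums to 1
```

Because the bias is injected directly into the forward pass, such an intervention requires no fine-tuning and adds only a constant per-layer overhead, consistent with the training-free, low-cost claim above.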