Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback (2404.14233v1)
Abstract: Rapidly developing Large Vision Language Models (LVLMs) have shown notable capabilities on a range of multi-modal tasks, but they still suffer from hallucination, where the generated text does not align with the given context, significantly restricting their use. Most previous work detects and mitigates hallucination at the coarse-grained level or requires expensive annotation (e.g., labeling by proprietary models or human experts). To address these issues, we propose detecting and mitigating hallucinations in LVLMs via fine-grained AI feedback. The basic idea is to use proprietary models to generate a small sentence-level hallucination annotation dataset, with which we train a detection model that performs sentence-level hallucination detection covering the primary hallucination types (i.e., object, attribute, and relationship). We then propose a detect-then-rewrite pipeline that automatically constructs a preference dataset for training a hallucination-mitigating model. Furthermore, we propose differentiating the severity of hallucinations and introduce Hallucination Severity-Aware Direct Preference Optimization (HSA-DPO), which mitigates hallucination in LVLMs by incorporating hallucination severity into preference learning. Extensive experiments demonstrate the effectiveness of our method.
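The severity-aware preference objective described in the abstract builds on Direct Preference Optimization (DPO). Below is a minimal, illustrative sketch of how per-pair hallucination severity scores (e.g., produced by the sentence-level detection model) could be folded into a DPO-style loss as a weighting term. The function name `hsa_dpo_loss`, the `1 + severity` weighting, and all tensor shapes are assumptions made for this sketch, not the paper's exact HSA-DPO formulation.

```python
# Illustrative sketch of a severity-weighted DPO objective.
# The weighting scheme below is an assumption for demonstration purposes.
import torch
import torch.nn.functional as F

def hsa_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps,
                 severity, beta=0.1):
    """Severity-aware DPO loss (illustrative, not the paper's exact form).

    Args:
        policy_chosen_logps: log-probs of preferred (rewritten) responses under the policy.
        policy_rejected_logps: log-probs of dispreferred (hallucinated) responses under the policy.
        ref_chosen_logps / ref_rejected_logps: the same quantities under the frozen reference model.
        severity: per-pair hallucination severity scores in [0, 1],
                  e.g., derived from the sentence-level detection model.
        beta: standard DPO temperature.
    """
    # Standard DPO log-ratio margin between preferred and dispreferred responses.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    margin = beta * (chosen_ratio - rejected_ratio)

    # Hypothetical severity weighting: pairs whose rejected response contains
    # more severe hallucinations contribute more to the loss.
    weights = 1.0 + severity
    return (weights * -F.logsigmoid(margin)).mean()

# Usage with dummy tensors standing in for batched log-probabilities.
b = 4
loss = hsa_dpo_loss(torch.randn(b, requires_grad=True), torch.randn(b, requires_grad=True),
                    torch.randn(b), torch.randn(b),
                    severity=torch.rand(b))
loss.backward()
```

The intuition matching the abstract is that weighting the DPO margin by severity makes preference pairs whose rejected response contains more severe hallucinations contribute more strongly to the update, rather than treating all hallucinations as equally costly.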
Authors: Wenyi Xiao, Ziwei Huang, Leilei Gan, Wanggui He, Haoyuan Li, Zhelun Yu, Hao Jiang, Fei Wu, Linchao Zhu