Enhancement of Multimodal LLMs through Hallucination-Aware Direct Preference Optimization
The paper under review addresses the persistent hallucination problem in large vision-language models (LVLMs), particularly in the generation of image-grounded textual descriptions. Despite rapid progress in LVLMs, hallucination — where models fabricate or inaccurately describe image content — remains a significant challenge and can produce seriously misleading outputs in applications such as medical diagnostics.
The authors propose a framework called Hallucination-Aware Direct Preference Optimization (HA-DPO) that reframes hallucination mitigation as a preference optimization problem. HA-DPO biases the model towards non-hallucinated outputs by training on pairs of responses to the same image: one hallucinatory and one accurate. A dedicated data-construction pipeline builds these preference pairs while keeping them consistent in style and quality.
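While the paper's exact objective may include additional terms, the core idea follows the standard DPO loss, conditioned here on both the image and the prompt. The notation below (image $v$, prompt $x$, preferred non-hallucinated response $y_w$, hallucinated response $y_l$, frozen reference policy $\pi_{\mathrm{ref}}$, temperature $\beta$) is a hedged sketch rather than the paper's verbatim formulation:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(v,\,x,\,y_w,\,y_l)\sim\mathcal{D}}
\left[
\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid v, x)}{\pi_{\mathrm{ref}}(y_w \mid v, x)}
- \beta \log \frac{\pi_\theta(y_l \mid v, x)}{\pi_{\mathrm{ref}}(y_l \mid v, x)}
\right)
\right]
```

Minimizing this objective widens the likelihood margin of the accurate response over the hallucinated one relative to the reference model, without requiring an explicit reward model or a reinforcement-learning loop.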
In empirical validation, HA-DPO yielded substantial reductions in hallucination across several state-of-the-art LVLMs. When applied to MiniGPT-4, POPE accuracy improved from 51.13% to 86.13%, an absolute gain of 35 percentage points, and the MME score rose from 932.00 to 1326.46, a relative increase of 42.32%. These results underscore the efficacy of HA-DPO not only in curbing hallucinations but also in enhancing the models' generalization capabilities.
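For reference, the reported gains correspond to straightforward absolute and relative differences over the MiniGPT-4 baseline:

```latex
\Delta_{\mathrm{POPE}} = 86.13\% - 51.13\% = 35.00 \text{ percentage points},
\qquad
\Delta_{\mathrm{MME}} = \frac{1326.46 - 932.00}{932.00} \approx 42.32\%
```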
The paper makes significant contributions to both the practical and theoretical sides of multimodal AI research. Practically, HA-DPO offers a lightweight, scalable remedy for hallucination that avoids extensive and costly data annotation or full retraining. Theoretically, its preference-learning strategy deepens the understanding of model biases, supporting further advances in model alignment and reliability.
Furthermore, the introduction of the Sentence-level Hallucination Ratio (SHR) offers a comprehensive, quantitative framework for evaluating hallucinations in LVLMs, extending beyond existing benchmarks that focus on predefined categories and therefore miss a broader range of hallucination types. SHR quantifies hallucination by checking each generated sentence against factual image information, providing an intuitive metric for further research and model development.
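In the paper, sentence-level hallucination judgments are obtained by comparing generated sentences against ground-truth image information; the sketch below assumes such per-sentence labels are already available and simply aggregates them into a ratio (the function name compute_shr and the label format are illustrative, not the authors' API):

```python
from typing import Iterable, List


def compute_shr(hallucination_labels: Iterable[List[bool]]) -> float:
    """Aggregate per-sentence hallucination labels into a single ratio.

    Each element of `hallucination_labels` is one generated description,
    represented as a list of booleans: True if that sentence contradicts
    or fabricates image content, False otherwise.
    """
    hallucinated = 0
    total = 0
    for sentence_flags in hallucination_labels:
        hallucinated += sum(sentence_flags)
        total += len(sentence_flags)
    if total == 0:
        raise ValueError("No sentences to evaluate.")
    return hallucinated / total


# Example: two generated descriptions, judged sentence by sentence.
labels = [
    [False, True, False],   # 1 of 3 sentences hallucinated
    [False, False],         # clean description
]
print(f"SHR = {compute_shr(labels):.2%}")  # SHR = 20.00%
```

A lower SHR indicates that a smaller fraction of generated sentences conflicts with the image, which is what HA-DPO training is intended to achieve.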
Looking ahead, HA-DPO holds promise for broader application across other modalities and could be adapted to improve more general LLMs. As the field of AI continues to evolve, strategies like HA-DPO that address model biases and enhance model objectivity will be critical for reliable deployment in real-world scenarios where accuracy and context are paramount.