FactAlign: Long-form Factuality Alignment of LLMs
The paper "FactAlign: Long-form Factuality Alignment of LLMs" addresses the persistent challenge of factual inaccuracies or hallucinations in responses generated by LLMs. This is a prevalent issue especially when generating long-form content, where ensuring factual precision is more complex due to the interwoven nature of claims within a lengthy text. The authors tackle this problem by introducing FactAlign, an innovative alignment framework crafted to bolster factuality in LLM outputs while preserving their usefulness in various query contexts.
Conceptual Framework and Methodology
FactAlign extends the Kahneman-Tversky Optimization (KTO) alignment method with a newly proposed fine-grained algorithm, fKTO. fKTO operates at the sentence level, evaluating and aligning the factual content of each statement within a generated response. By leveraging recent advances in automatic factuality evaluation, FactAlign uses these sentence-level assessments to guide the alignment process.
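To make the sentence-level idea concrete, the sketch below applies a KTO-style value function to per-sentence rewards rather than to whole responses. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name, the treatment of the KTO reference point z_ref as a precomputed scalar, and the mean aggregation over sentences are all assumptions.

```python
import torch

def fkto_style_loss(policy_logps, ref_logps, supported, z_ref,
                    beta=0.1, lam_d=1.0, lam_u=1.0):
    """KTO-style loss applied per sentence (illustrative sketch).

    policy_logps: (num_sentences,) summed token log-probs per sentence
                  under the policy model.
    ref_logps:    (num_sentences,) the same quantity under the frozen
                  reference model.
    supported:    (num_sentences,) bool tensor; True where the
                  factuality evaluator judged the sentence supported.
    z_ref:        scalar reference point (a KL estimate in KTO;
                  assumed precomputed here).
    """
    # Implicit per-sentence reward: log-ratio of policy to reference.
    reward = policy_logps - ref_logps

    # KTO value function: supported sentences are pushed above the
    # reference point, unsupported sentences below it.
    desirable = lam_d * (1 - torch.sigmoid(beta * (reward - z_ref)))
    undesirable = lam_u * (1 - torch.sigmoid(beta * (z_ref - reward)))

    loss = torch.where(supported, desirable, undesirable)
    return loss.mean()
```

The key contrast with response-level KTO is simply the granularity: each sentence contributes its own desirable or undesirable term, so factual and non-factual parts of the same response receive opposite gradients.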
The framework pairs fKTO with an automatic factuality evaluator that decomposes long-form outputs into atomic statements, each checked against a curated knowledge corpus to determine whether it is supported. The training objective targets the factual precision of responses as well as their recall, summarized by a factual F1 score. The framework's efficacy is demonstrated through rigorous testing on open-domain and information-seeking tasks, showing significant improvements in both factual accuracy and the overall helpfulness of LLM responses.
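As a rough illustration of the metric being optimized, the following sketch computes a factual F1 in the style of the F1@K measure from the long-form factuality literature: precision over extracted claims, recall against a target count of supported claims. The helper name and the choice of K are hypothetical, not taken from the paper.

```python
def factual_f1(num_supported, num_claims, k=64):
    """Factual F1 sketch (assumed F1@K-style formulation).

    num_supported: atomic claims judged supported by the corpus.
    num_claims:    total atomic claims extracted from the response.
    k:             assumed number of supported claims a maximally
                   informative response should contain.
    """
    if num_claims == 0:
        return 0.0
    precision = num_supported / num_claims          # fraction supported
    recall = min(num_supported / k, 1.0)            # informativeness proxy
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: 30 of 40 extracted claims supported, target K = 64
# precision = 0.75, recall ~= 0.469, F1 ~= 0.577
print(round(factual_f1(30, 40, k=64), 3))
```

The recall term is what discourages a degenerate strategy of emitting only a few safe claims: a response can only score well by being both accurate and sufficiently informative.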
Empirical Findings and Analysis
The authors validate the approach through experiments on open-domain prompts and specific information-seeking questions. FactAlign markedly improves the factual accuracy of responses, as measured by factual F1 scores, and the evaluations show that it trains LLMs to deliver more information-rich responses without compromising factuality.
Furthermore, the authors conduct an ablation study to isolate the contribution of each component, underscoring the indispensable role of fine-grained factual alignment in achieving superior factuality metrics.
Implications and Future Directions
FactAlign presents significant theoretical and practical implications for the development of LLMs. Theoretically, it informs the design of alignment frameworks by demonstrating the effectiveness of sentence-level alignment in enhancing factual precision. Practically, FactAlign contributes to the field of AI by offering a pathway to mitigate hallucinations in LLM outputs, thereby broadening their applicability in real-world settings where factual accuracy is non-negotiable.
Future work could extend FactAlign by integrating broader knowledge bases or real-time web data, enriching the factual context against which LLM outputs are evaluated. Further refinement of automatic factuality metrics could likewise improve the granularity and reliability of factual assessments in dynamic, evolving knowledge domains.
In conclusion, the FactAlign framework establishes a robust foundation for future work aimed at reconciling the dual objectives of informativeness and factual accuracy in LLMs, paving the way for more reliable AI-driven communication tools in sensitive applications.