
Learning to Plan and Generate Text with Citations (2404.03381v3)

Published 4 Apr 2024 in cs.CL

Abstract: The increasing demand for the deployment of LLMs in information-seeking scenarios has spurred efforts in creating verifiable systems, which generate responses to queries along with supporting evidence. In this paper, we explore the attribution capabilities of plan-based models which have been recently shown to improve the faithfulness, grounding, and controllability of generated text. We conceptualize plans as a sequence of questions which serve as blueprints of the generated content and its organization. We propose two attribution models that utilize different variants of blueprints, an abstractive model where questions are generated from scratch, and an extractive model where questions are copied from the input. Experiments on long-form question-answering show that planning consistently improves attribution quality. Moreover, the citations generated by blueprint models are more accurate compared to those obtained from LLM-based pipelines lacking a planning component.


Exploring Attribution in Plan-Based Models for Text Generation with Citations

Introduction to Attribution in Text Generation

As LLMs are increasingly deployed in information-seeking scenarios, there is growing interest in verifiable systems that return supporting evidence alongside their responses. This research studies how plan-based text generation models can be extended with attribution, so that long-form responses to queries carry citations to their sources.

The Core Challenges

Two primary challenges are addressed:

Attribution Quality: How can models produce responses whose citations are accurate and whose content is faithful to the cited sources?
Plan-Based Text Generation: How can blueprints, conceptualized as sequences of questions, improve the structure, faithfulness, and citation accuracy of the generated content?

Methodology and Models

The paper introduces models based on two blueprint strategies:

Abstractive Blueprint Models, where generated questions form a structured plan to guide the content generation process.
Extractive Blueprint Models, which construct blueprints by selecting relevant questions directly from the input data.

Both models were compared against baseline systems without planning components, evaluating their effectiveness in terms of output quality and attribution accuracy.
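The extractive idea, copying plan questions from the input rather than generating them, can be illustrated with a minimal Python sketch. Everything here is hypothetical and chosen for illustration only: the `Passage` class, the word-overlap score, and the choice to keep the source passage id next to each question are assumptions, not the paper's implementation, which uses trained models rather than a lexical heuristic.

```python
from dataclasses import dataclass, field


@dataclass
class Passage:
    pid: int    # passage index, usable later as a citation marker such as [1]
    text: str
    # Assumption: each input passage arrives with candidate questions attached.
    questions: list[str] = field(default_factory=list)


def overlap(a: str, b: str) -> int:
    """Toy relevance score: number of shared lowercase whitespace tokens."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def extractive_blueprint(query: str, passages: list[Passage], k: int = 2) -> list[tuple[str, int]]:
    """Copy the k input questions that overlap most with the query.

    Returns (question, passage id) pairs in plan order. An abstractive
    blueprint would instead generate new questions with a model.
    """
    candidates = [(q, p.pid) for p in passages for q in p.questions]
    # sorted() is stable, so candidates with equal scores keep input order.
    ranked = sorted(candidates, key=lambda c: overlap(query, c[0]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    docs = [
        Passage(1, "The moon's gravity pulls on the oceans.",
                ["What causes tides?", "How far is the moon?"]),
        Passage(2, "Most coasts see two high tides per day.",
                ["Why are there two high tides each day?"]),
    ]
    for question, pid in extractive_blueprint("what causes high tides", docs):
        print(f"{question} [{pid}]")
```

The resulting question sequence plays the role of the blueprint: the response would then be generated to answer the questions in order, citing the passages they came from.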

Key Findings and Results

The experiments, conducted on long-form question answering, show that blueprint models consistently improve both the quality of generated content and the accuracy of citations, and that their citations are more accurate than those obtained from LLM-based pipelines lacking a planning component. Notably, the extractive blueprint model shows marked gains in summary quality, suggesting that planning and attribution can be combined effectively.

Quantitative Analysis shows:

An improvement in ROUGE-L scores, indicating closer overlap with reference responses in content and ordering.
Higher ANLI scores (entailment judged by a natural language inference model), reflecting enhanced factual consistency and faithfulness.
Superior attribution quality, as evidenced by improved AutoAIS metrics.
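To make these metrics more concrete, here is a rough Python sketch of two of them. `rouge_l_f1` is a simplified ROUGE-L (longest common subsequence over whitespace tokens, balanced F-measure) and will not match official implementations exactly. `attribution_rate` is only an AutoAIS-style illustration: it assumes the metric can be approximated as the fraction of sentences entailed by their cited passages, and the `entails` callable is a hypothetical stand-in for an NLI model.

```python
from typing import Callable


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def attribution_rate(
    cited_sentences: list[tuple[str, list[int]]],
    passages: dict[int, str],
    entails: Callable[[str, str], bool],  # hypothetical NLI judge: (premise, hypothesis)
) -> float:
    """Fraction of sentences entailed by the concatenation of their cited passages.

    Sentences without citations count as unsupported.
    """
    if not cited_sentences:
        return 0.0
    supported = sum(
        1
        for sentence, pids in cited_sentences
        if pids and entails(" ".join(passages[p] for p in pids), sentence)
    )
    return supported / len(cited_sentences)
```

In practice `entails` would wrap a trained entailment classifier; a substring check or similar heuristic is only suitable for testing the plumbing.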

Implications and Future Directions

This paper underscores the potential of blueprint models in fostering more faithful and verifiable text generation systems. The findings suggest that planning mechanisms not only aid in structuring generated content but also play a crucial role in enhancing citation accuracy.

Practical Implications include:

The utilization of blueprint models in information retrieval and summarization tasks, especially those requiring verifiable sources.
Improvement in user trust towards AI-generated content through transparent attribution.

Theoretical Implications involve:

Validation of the hypothesis that explicit content planning can lead to improved generation fidelity and source attribution.
Demonstration of the transferability of attribution skills across different information-seeking tasks and domains.

Looking ahead, further research could evaluate blueprint models on larger and more complex datasets, both to broaden their applicability and to better understand their limitations. Future work might also examine how the choice of blueprint strategy affects the diversity and comprehensiveness of the generated content.

Conclusion

This research is a step towards text generation models that not only produce coherent and relevant responses but also attribute their sources accurately. By using blueprint plans, it offers a way to improve the reliability and trustworthiness of AI-generated content, addressing a central challenge in generative AI and information verification.


Authors

Constanza Fierro
Reinald Kim Amplayo
Fantine Huot
Nicola De Cao
Joshua Maynez
Shashi Narayan
Mirella Lapata
