Agenda-based Narrative Extraction: Steering Pathfinding Algorithms with Large Language Models

Published 31 Mar 2026 in cs.CL, cs.AI, and cs.IR | (2603.29661v1)

Abstract: Existing narrative extraction methods face a trade-off between coherence, interactivity, and multi-storyline support. Narrative Maps supports rich interaction and generates multiple storylines as a byproduct of its coverage constraints, though this comes at the cost of individual path coherence. Narrative Trails achieves high coherence through maximum capacity path optimization but provides no mechanism for user guidance or multiple perspectives. We introduce agenda-based narrative extraction, a method that bridges this gap by integrating LLMs into the Narrative Trails pathfinding process to steer storyline construction toward user-specified perspectives. Our approach uses an LLM at each step to rank candidate documents based on their alignment with a given agenda while maintaining narrative coherence. Running the algorithm with different agendas yields different storylines through the same corpus. We evaluated our approach on a news article corpus using LLM judges with Claude Opus 4.5 and GPT-5.1, measuring both coherence and agenda alignment across 64 endpoint pairs and 6 agendas. LLM-driven steering achieves 9.9% higher alignment than keyword matching on semantic agendas (p=0.017), with 13.3% improvement on "Regime Crackdown" specifically (p=0.037), while keyword matching remains competitive on agendas with literal keyword overlap. The coherence cost is minimal: LLM steering reduces coherence by only 2.2% compared to the agenda-agnostic baseline. Counter-agendas that contradict the source material score uniformly low (2.2-2.5) across all methods, confirming that steering cannot fabricate unsupported narratives.

Authors (7)

Summary

The paper introduces an LLM-steered pathfinding approach that integrates user-defined agendas to dynamically generate multiperspective narrative paths.
The method achieves up to 13.3% higher agenda alignment than keyword matching (on the "Regime Crackdown" agenda) while reducing coherence by only 2.2% relative to the agenda-agnostic baseline.
Evaluation with two LLM judges, sensitivity analyses, and counter-agenda controls indicates that the framework is robust to parameter choices and stays grounded in the corpus.

Agenda-Based Narrative Extraction via LLM-Steered Pathfinding

Motivation and Context

Narrative extraction from document collections is central to tasks such as news summarization, sensemaking, and intelligence analysis. Traditional approaches to document-level storyline extraction present a tension between maximizing narrative coherence, supporting user interactivity, and enabling multiperspective analysis. Prior systems such as Narrative Maps generate multiple interconnected storylines, delivering interaction and coverage, but at the expense of individual path coherence. In contrast, Narrative Trails optimizes for the most coherent single storyline utilizing a maximum-capacity bottleneck path approach but remains rigid, offering no mechanisms for user guidance or alternative perspective generation.

This work proposes an intermediate approach: agenda-based narrative extraction. By integrating LLMs into the pathfinding process, agenda-based extraction introduces interactive, user-defined steering on top of the high-coherence guarantees of Narrative Trails. This enables the dynamic generation of alternative, perspective-driven narratives from the same corpus, while quantifying and minimizing the associated trade-offs in coherence.
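
For reference, the maximum-capacity (bottleneck) path objective attributed to Narrative Trails above can be sketched as a Dijkstra-style search that maximizes the weakest edge on the path. This is a minimal illustration, not the authors' implementation: the dict-of-dicts graph representation, the coherence weights, and the function name `max_capacity_path` are all assumptions made for the example.

```python
import heapq

def max_capacity_path(graph, source, target):
    """Return (bottleneck, path) maximizing the minimum edge weight from source to target.

    graph: dict mapping node -> {neighbor: coherence}. Hypothetical representation.
    """
    best = {source: float("inf")}      # best known bottleneck for reaching each node
    parent = {source: None}
    heap = [(-best[source], source)]   # capacities are negated because heapq is a min-heap
    done = set()
    while heap:
        neg_cap, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nbr, weight in graph.get(node, {}).items():
            cap = min(-neg_cap, weight)  # bottleneck if we extend the path through this edge
            if cap > best.get(nbr, float("-inf")):
                best[nbr] = cap
                parent[nbr] = node
                heapq.heappush(heap, (-cap, nbr))
    if target not in parent:
        return 0.0, []                 # target unreachable
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[target], path[::-1]
```

On a toy graph where one route has edges 0.9 and 0.5 and another has edges 0.4 and 0.8, the search prefers the first route, because its weakest link (0.5) is stronger than the second route's weakest link (0.4).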

Methodological Contributions

The paper develops an LLM-steered pathfinding algorithm that generalizes the maximum capacity approach. Rather than greedily following the most coherent transition at each step, the algorithm selects a candidate set of top-$k$ neighbors (by edge coherence) and uses an LLM to rerank them according to their alignment with a user-specified natural language agenda. Three types of agenda are considered: literal (keyword-matched), semantic (requiring non-trivial inference beyond the surface form), and counter (contradicting the corpus consensus and serving as negative controls).
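
The select-then-rerank step described above can be sketched as follows. This is a simplified illustration under several assumptions: the graph is a dict of dicts with coherence weights, `rank_fn` is a hypothetical stand-in for the LLM call (any callable that returns the candidates in preferred order), and the walk simply stops at a dead end. How the actual algorithm guarantees arrival at the destination is not described in this summary, so none is implemented here.

```python
def top_k_candidates(graph, current, visited, k):
    """Unvisited neighbors of `current`, sorted by edge coherence (descending), truncated to k."""
    nbrs = [(n, w) for n, w in graph.get(current, {}).items() if n not in visited]
    nbrs.sort(key=lambda item: item[1], reverse=True)
    return [n for n, _ in nbrs[:k]]

def steered_walk(graph, source, target, rank_fn, k=3, max_steps=50):
    """Walk from source toward target, letting rank_fn choose among the top-k coherent neighbors.

    rank_fn(path, candidates, target) -> candidates reordered, best first.
    In the paper this role is played by an LLM; here it is any callable.
    """
    path = [source]
    visited = {source}
    while path[-1] != target and len(path) <= max_steps:
        candidates = top_k_candidates(graph, path[-1], visited, k)
        if not candidates:
            break                       # dead end: return the partial path
        ranked = rank_fn(path, candidates, target)
        nxt = ranked[0]                 # follow the reranker's first choice
        path.append(nxt)
        visited.add(nxt)
    return path
```

The coherence filter runs before the reranker, so a document that fits the agenda but is only weakly connected to the current one never reaches `rank_fn`. In this sketch, that is the mechanism that keeps steering from drifting far from the coherent neighborhood.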

Prompts for LLM scoring are constructed to provide narrative context, current state, destination, and agenda, instructing the model to return a strict ranking of candidate continuations. For evaluation, the authors utilize state-of-the-art LLMs (Claude Opus 4.5, GPT-5.1) as judges to score the resulting narratives on coherence (logical flow, thematic consistency, temporal order, completeness) and agenda-alignment (agenda support, persuasiveness, evidentiary strength, directionality, and bias effectiveness). This dual-judge strategy mitigates single-model biases and leverages prior findings on LLM-judge reliability.
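
To make the prompt structure concrete, the sketch below assembles the four ingredients named above (narrative context, current state, destination, agenda) and validates that a reply is a strict ranking. The wording of the prompt, the comma-separated reply format, and the helper names `build_prompt` and `parse_ranking` are assumptions for illustration; the paper's actual prompt text is not reproduced in this summary.

```python
import re

def build_prompt(story_so_far, current, destination, agenda, candidates):
    """Assemble a ranking prompt from document titles (hypothetical wording)."""
    lines = [
        "You are helping build a news storyline.",
        f"Agenda: {agenda}",
        "Story so far: " + " -> ".join(story_so_far),
        f"Current document: {current}",
        f"Destination document: {destination}",
        "Candidates:",
    ]
    lines += [f"{i}. {title}" for i, title in enumerate(candidates, start=1)]
    lines.append("Rank ALL candidates from best to worst as comma-separated numbers.")
    return "\n".join(lines)

def parse_ranking(response, n):
    """Return 0-based candidate indices in ranked order.

    Falls back to the original (coherence) order if the reply is not a
    strict ranking, i.e. not a permutation of 1..n.
    """
    numbers = [int(tok) for tok in re.findall(r"\d+", response)]
    if sorted(numbers) == list(range(1, n + 1)):
        return [i - 1 for i in numbers]
    return list(range(n))
```

Falling back to the coherence order on a malformed reply is one possible design choice: it keeps the walk moving and degrades toward the unsteered behavior rather than failing.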

The approach is benchmarked on a temporally and topically rich news corpus covering the 2021 Cuban protests, evaluating 64 endpoint pairs across six agendas.

Empirical Results

The experimental results quantify the trade-off between coherence and agenda alignment introduced by LLM-based steering:

LLM-based steering yields 9.9% higher alignment on semantic (inference-based) agendas versus keyword matching ($p=0.017$) and a 13.3% gain on the "Regime Crackdown" agenda ($p=0.037$).
For literal, keyword-aligned agendas, keyword matching is competitive or superior, indicating that semantic LLM steering is most valuable where surface-form cues are insufficient.
Coherence loss due to LLM steering is minimal, only 2.2% below the coherence of the agenda-agnostic maximum-capacity baseline.
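
The contrast between literal and semantic agendas is easier to see with a toy keyword-matching reranker. The sketch below is an assumed, simplified version of such a baseline (token overlap between the agenda and each candidate text); the paper's exact keyword-matching procedure is not specified in this summary, and the example sentences in the usage note are invented for illustration.

```python
import re

def tokenize(text):
    """Lowercased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_rank(agenda, candidates):
    """Order candidate texts by how many distinct agenda tokens they contain.

    sorted() is stable, so ties keep their incoming (e.g. coherence) order.
    """
    agenda_tokens = tokenize(agenda)
    return sorted(candidates, key=lambda text: -len(agenda_tokens & tokenize(text)))
```

With the agenda "regime crackdown", a made-up headline such as "Crackdown by regime forces condemned" scores 2, while "Police detain demonstrators overnight" scores 0 even though it plausibly supports the same agenda. Cases like the second, where the connection requires inference rather than word overlap, are where the summary reports the largest gains from LLM steering.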

The correlation between coherence and agenda-alignment scores across all method/agenda pairs is weak (Pearson $r\approx0.10$), suggesting that high alignment is achievable without a fundamental reduction in coherence.
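
For readers who want to reproduce this kind of check on their own scores, Pearson's $r$ can be computed as below. The function is a plain textbook implementation; the inputs would be paired coherence and alignment scores, one pair per method/agenda combination, which is an assumption about how the reported value was aggregated.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)  # undefined (division by zero) if either list is constant
```

A value near 0, as reported, means that knowing a path's coherence score says little about its alignment score, and vice versa.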

Figure 1: Percentage differences in coherence and alignment with respect to maximum capacity baseline for different steering approaches.

Visualization of agenda-steered narrative paths demonstrates how different framing choices lead to diverging trajectories through the corpus UMAP space, with clear partitioning based on the imposed agenda. The system's inability to fabricate high-scoring counter-agenda narratives (all methods yield low alignment for agendas that contradict the majority of the corpus) underscores the inherent data-groundedness of the approach.

Figure 2: Visualization of how agendas produce distinct paths through the embedding space; each colored path represents a different agenda steering the Narrative Trails framework.

Figure 3: Narrative map showing the topological separation and convergence of storylines under agenda steering between the same endpoints.

Sensitivity and Robustness

Sensitivity analysis explores the impact of candidate set size, LLM temperature, and model size. Results are stable across these parameters, and the default choices provide a robust trade-off between cost and performance. Prompt engineering has a non-trivial influence: chain-of-thought (CoT) prompting boosts agenda alignment by 25.7% at an increased compute cost. Paths produced by direct and CoT prompting for the same agenda overlap only partially (Jaccard similarity $\approx 0.58$), which indicates that both the model's reasoning process and prompt granularity meaningfully affect document selection.
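
The overlap measure mentioned above can be computed as follows, assuming each path is compared as a set of document identifiers (the identifiers in the usage note are hypothetical, and treating paths as unordered sets is an assumption about how the overlap was measured).

```python
def jaccard(path_a, path_b):
    """Jaccard similarity between the sets of documents visited by two paths."""
    a, b = set(path_a), set(path_b)
    if not a and not b:
        return 1.0  # convention: two empty paths are identical
    return len(a & b) / len(a | b)
```

For example, two four-document paths that share three documents have five distinct documents in total, giving a similarity of 3/5 = 0.6, which is close to the reported value.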

Theoretical and Practical Implications

The findings establish a new point in the design space of narrative extraction, demonstrating that user-steered, multiperspective, and highly coherent narrative construction is attainable by using LLMs as pathfinding rerankers. For applications in intelligence, journalism, and digital humanities, this approach lets analysts interactively generate and compare alternative storylines by agenda, without complicated manual graph operations. Importantly, because counter-agenda steering fails to fabricate unsupported narratives, the system's outputs remain grounded in the evidence contained in the underlying dataset.

More broadly, this work demonstrates a template for integrating LLM-guided selection into constrained combinatorial optimization frameworks (here, pathfinding on coherence graphs), opening further avenues for mixed-initiative, user-in-the-loop narrative analytics. Potential applications extend to customizable history generation, multiperspective news analysis, and explainability in retrieval pipelines.

Limitations and Future Directions

The method performs robustly on a single, moderately sized corpus with one set of agendas, but generalization to other domains, languages, and agenda types requires further empirical validation. Evaluation relies on LLMs as judges; future work should include human-subject experiments to confirm that LLM-judged narrative quality matches the requirements of target analysts.

Interactive deployment at scale is currently infeasible due to LLM inference latency, indicating a need for model distillation or caching strategies. The ethical risk of misusing agenda-driven selection for narrative manipulation is acknowledged, though the explicit nature of agenda specification in this system supports transparency and auditability.

Conclusion

Agenda-based narrative extraction via LLM-steered pathfinding bridges the gap between high-coherence narrative construction and interactive, multiperspective exploration, and it quantifies the cost of doing so (2603.29661). It enables perspective-driven narrative synthesis with a small loss of path coherence (2.2%) and resists fabricating narratives that the corpus does not support. Using LLMs as combinatorial rerankers in pathfinding opens avenues for mixed-initiative analytics, with potential tools for both research and application in AI-driven sensemaking and digital historiography.
