Determine LLM Summarization Accuracy for Multi-Document Sensemaking
Determine how accurately large language models (LLMs) can summarize multiple given documents in sensemaking tasks, ideally by scoring their outputs against established ground-truth summaries to quantify performance.
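One minimal way to quantify performance against ground-truth summaries is lexical-overlap scoring. The sketch below implements a simple unigram-overlap F1 (a ROUGE-1-style metric, not the official ROUGE implementation); the example summaries and the function name `rouge1_f1` are illustrative assumptions, not from the paper.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a model summary and a ground-truth
    summary (a simplified ROUGE-1 variant)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # shared token counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: score one model summary against one
# ground-truth summary from a multi-document sensemaking task.
model_summary = "the reports agree that demand rose sharply in spring"
ground_truth = "demand rose sharply in spring according to all reports"
score = rouge1_f1(model_summary, ground_truth)
```

In practice, evaluations of this kind typically average such scores over a corpus and complement them with semantic or human-judged measures, since lexical overlap alone misses paraphrase.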
References
We lack an understanding of how accurately LLMs can summarize when analyzing multiple given documents in sensemaking tasks.
— Steering LLM Summarization with Visual Workspaces for Sensemaking
(arXiv:2409.17289, Tang et al., 2024), Introduction