Evaluate LLM summarization in lower-resource languages beyond English and Chinese
Determine how effectively large language models summarize text in lower-resource languages beyond English and Chinese. Assess performance across multiple domains using fine-grained evaluation criteria such as faithfulness, completeness, and conciseness, in order to establish multilingual robustness.
References
Evaluating how effectively LLMs handle other lower-resource languages remains an open question.
— Towards Multi-dimensional Evaluation of LLM Summarization across Domains and Languages
(2506.00549 - Min et al., 31 May 2025) in Limitations (unnumbered section), third paragraph