
Exploring Prompting Large Language Models as Explainable Metrics (2311.11552v1)

Published 20 Nov 2023 in cs.CL, cs.AI, and cs.LG

Abstract: This paper describes the IUST NLP Lab submission to the Prompting LLMs as Explainable Metrics Shared Task at the Eval4NLP 2023 Workshop on Evaluation & Comparison of NLP Systems. We propose a zero-shot prompt-based strategy for explainable evaluation of the summarization task using LLMs. The experiments demonstrate the promising potential of LLMs as evaluation metrics in NLP, particularly for summarization. Both few-shot and zero-shot approaches are employed. Our best prompts achieved a Kendall correlation of 0.477 with human evaluations on the text summarization test data. Code and results are publicly available on GitHub.
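The reported metric is Kendall's rank correlation between LLM-assigned scores and human judgments. A minimal sketch of how such agreement is computed (the score lists below are illustrative placeholders, not data from the paper):

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / total pairs, ties ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # pairs tied in either ranking contribute to neither count
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical per-summary scores: LLM-predicted vs. human quality ratings.
llm_scores = [4, 2, 3, 5, 1]
human_scores = [3, 2, 4, 5, 1]
print(kendall_tau(llm_scores, human_scores))  # → 0.8
```

A value near 1 indicates the LLM ranks summaries in nearly the same order as human annotators; the paper's 0.477 reflects moderate agreement.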

Authors (1)
  1. Ghazaleh Mahmoudi (3 papers)
Citations (4)