
SummScore: A Comprehensive Evaluation Metric for Summary Quality Based on Cross-Encoder (2207.04660v1)

Published 11 Jul 2022 in cs.CL and cs.IR

Abstract: Text summarization models are often trained to produce summaries that meet human quality requirements. However, existing evaluation metrics for summary text are only rough proxies for summary quality: they correlate poorly with human scoring and inhibit summary diversity. To solve these problems, we propose SummScore, a comprehensive metric for summary quality evaluation based on a Cross-Encoder. First, by adopting the original-summary measurement mode and comparing the summary against the semantics of the original text, SummScore avoids inhibiting summary diversity. With the help of a Cross-Encoder pre-trained for text matching, SummScore can effectively capture subtle differences between the semantics of summaries. Second, to improve comprehensiveness and interpretability, SummScore consists of four fine-grained submodels, which measure Coherence, Consistency, Fluency, and Relevance separately. We use multiple rounds of semi-supervised training to improve the performance of our model on extremely limited annotated data. Extensive experiments show that SummScore significantly outperforms existing evaluation metrics on all four dimensions in correlation with human scoring. We also provide the quality evaluation results of SummScore on 16 mainstream summarization models for later research.
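
The structure described in the abstract, four submodels that each score an (original text, summary) pair with no reference summary involved, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `FourDimensionScorer`, `PairScorer`, and `toy_overlap_scorer` are hypothetical names, and the token-overlap function is a stand-in for a trained Cross-Encoder submodel.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A scorer maps (original document, candidate summary) to a float score.
# Hypothetical type alias; the paper's submodels are Cross-Encoders.
PairScorer = Callable[[str, str], float]

# The four dimensions named in the abstract.
DIMENSIONS = ("coherence", "consistency", "fluency", "relevance")


def toy_overlap_scorer(document: str, summary: str) -> float:
    """Hypothetical stand-in for a trained submodel.

    Returns the fraction of summary tokens that also occur in the document.
    """
    doc_tokens = set(document.lower().split())
    sum_tokens = summary.lower().split()
    if not sum_tokens:
        return 0.0
    return sum(tok in doc_tokens for tok in sum_tokens) / len(sum_tokens)


@dataclass
class FourDimensionScorer:
    """Holds one pair scorer per quality dimension (illustrative only)."""

    submodels: Dict[str, PairScorer]

    def score(self, document: str, summary: str) -> Dict[str, float]:
        # Each dimension is scored independently against the original text,
        # so no reference summary is needed.
        return {dim: self.submodels[dim](document, summary) for dim in DIMENSIONS}


# Usage: plug the same toy scorer into every dimension.
scorer = FourDimensionScorer({dim: toy_overlap_scorer for dim in DIMENSIONS})
scores = scorer.score("the cat sat on the mat", "the cat sat")
```

In a real setting each entry of `submodels` would wrap a separately trained model; keeping the dimensions as separate callables mirrors the abstract's point that per-dimension scores aid interpretability.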

Authors (7)
  1. Wuhang Lin (2 papers)
  2. Shasha Li (57 papers)
  3. Chen Zhang (403 papers)
  4. Bin Ji (28 papers)
  5. Jie Yu (98 papers)
  6. Jun Ma (347 papers)
  7. Zibo Yi (3 papers)
Citations (5)