Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
41 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
41 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Revisiting Automatic Question Summarization Evaluation in the Biomedical Domain (2303.10328v1)

Published 18 Mar 2023 in cs.CL

Abstract: Automatic evaluation metrics have been facilitating the rapid development of automatic summarization methods by providing instant and fair assessments of the quality of summaries. Most metrics have been developed for the general domain, especially news and meeting notes, or other language-generation tasks. However, these metrics are applied to evaluate summarization systems in different domains, such as biomedical question summarization. To better understand whether commonly used evaluation metrics are capable of evaluating automatic summarization in the biomedical domain, we conduct human evaluations of summarization quality from four different aspects of a biomedical question summarization task. Based on human judgments, we identify different noteworthy features for current automatic metrics and summarization systems as well. We also release a dataset of our human annotations to aid the research of summarization evaluation metrics in the biomedical domain.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (4)
  1. Hongyi Yuan (23 papers)
  2. Yaoyun Zhang (6 papers)
  3. Fei Huang (408 papers)
  4. Songfang Huang (51 papers)
Citations (1)