ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models (2205.01523v1)

Published 3 May 2022 in cs.CL

Abstract: Nowadays, pretrained language models (PLMs) have dominated the majority of NLP tasks. However, little research has been conducted on systematically evaluating the language abilities of PLMs. In this paper, we present a large-scale empirical study on general language ability evaluation of PLMs (ElitePLM). In our study, we design four evaluation dimensions, i.e., memory, comprehension, reasoning, and composition, to measure ten widely-used PLMs within five categories. Our empirical results demonstrate that: (1) PLMs with varying training objectives and strategies are good at different ability tests; (2) fine-tuning PLMs in downstream tasks is usually sensitive to data size and distribution; (3) PLMs have excellent transferability between similar tasks. Moreover, the prediction results of PLMs in our experiments are released as an open resource for deeper and more detailed analysis of the language abilities of PLMs. This paper can guide future work in selecting, applying, and designing PLMs for specific tasks. We have made all the details of our experiments publicly available at https://github.com/RUCAIBox/ElitePLM.
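
The abstract describes the protocol only at a high level: each PLM is fine-tuned on downstream tasks that probe one of the four ability dimensions, and results are compared across models. As a rough illustration of that kind of setup (not the authors' actual code from the RUCAIBox/ElitePLM repository), the Python sketch below fine-tunes a single PLM on one stand-in comprehension-style classification task with Hugging Face transformers; the model name, dataset choice, and hyperparameters are illustrative assumptions rather than the paper's configuration.

# Illustrative sketch only: fine-tune one PLM on one downstream task, in the
# spirit of ElitePLM's ability tests. Model, dataset, and hyperparameters are
# assumptions for demonstration, not the paper's exact setup.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "bert-base-uncased"        # one of many PLMs one could evaluate
raw = load_dataset("glue", "sst2")      # stand-in comprehension-style task

tokenizer = AutoTokenizer.from_pretrained(model_name)

def tokenize(batch):
    # Pad/truncate to a fixed length so examples batch uniformly.
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

encoded = raw.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

args = TrainingArguments(
    output_dir="eliteplm_sketch",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)

trainer.train()
print(trainer.evaluate())               # eval loss on the held-out split

Under a protocol like ElitePLM's, a run of this sort would be repeated per model and per task within a dimension, which is also where the reported sensitivity to data size and distribution becomes visible.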

Authors (9)
  1. Junyi Li (92 papers)
  2. Tianyi Tang (30 papers)
  3. Zheng Gong (69 papers)
  4. Lixin Yang (27 papers)
  5. Zhuohao Yu (15 papers)
  6. Zhipeng Chen (46 papers)
  7. Jingyuan Wang (64 papers)
  8. Wayne Xin Zhao (196 papers)
  9. Ji-Rong Wen (299 papers)
Citations (7)

GitHub: https://github.com/RUCAIBox/ElitePLM