
Re-Evaluating GermEval17 Using German Pre-Trained Language Models (2102.12330v2)

Published 24 Feb 2021 in cs.CL and cs.LG

Abstract: The lack of a commonly used benchmark data set (collection) such as (Super-)GLUE (Wang et al., 2018, 2019) for the evaluation of non-English pre-trained language models is a severe shortcoming of current English-centric NLP research. It concentrates a large part of the research on English and neglects the uncertainty involved in transferring conclusions found for the English language to other languages. We evaluate the performance of the German and multilingual BERT-based models currently available via the huggingface transformers library on the four tasks of the GermEval17 workshop. We compare them to pre-BERT architectures (Wojatzki et al., 2017; Schmitt et al., 2018; Attia et al., 2018) as well as to an ELMo-based architecture (Biesialska et al., 2020) and a BERT-based approach (Guhr et al., 2020). The observed improvements are put in relation to those for similar tasks and similar models (pre-BERT vs. BERT-based) for the English language in order to draw tentative conclusions about whether the observed improvements are transferable to German or potentially other related languages.
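The abstract mentions evaluating German and multilingual BERT-based models available through the huggingface transformers library. The snippet below is a minimal sketch (not taken from the paper) of how such a model can be loaded and applied to a single German sentence for a classification task like GermEval17 sentiment detection; the model name and the three-label setup are assumptions, and the classification head is untrained here.

```python
# Minimal sketch: load a German BERT model from the Hugging Face transformers
# library and run it on one sentence. The checkpoint name and num_labels=3
# (negative / neutral / positive, as in GermEval17 sentiment) are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-german-cased"  # one of several German BERT variants on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Example review-style sentence about Deutsche Bahn, the GermEval17 domain.
inputs = tokenizer("Die Bahn war mal wieder zu spät.",
                   return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

print(logits.argmax(dim=-1))  # predicted class id; meaningful only after fine-tuning
```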

Authors (3)
  1. M. Aßenmacher (4 papers)
  2. A. Corvonato (1 paper)
  3. C. Heumann (3 papers)
Citations (6)