Evaluation of Transfer Learning for Polish with a Text-to-Text Model (2205.08808v1)

Published 18 May 2022 in cs.CL and cs.LG

Abstract: We introduce a new benchmark for assessing the quality of text-to-text models for Polish. The benchmark consists of diverse tasks and datasets: the KLEJ benchmark adapted for text-to-text, en-pl translation, summarization, and question answering. In particular, since summarization and question answering lack benchmark datasets for Polish, we describe their construction and make them publicly available. Additionally, we present plT5, a general-purpose text-to-text model for Polish that can be fine-tuned on various NLP tasks with a single training objective. Unsupervised denoising pre-training is performed efficiently by initializing the model weights from the multilingual T5 (mT5) counterpart. We evaluate the performance of plT5, mT5, Polish BART (plBART), and Polish GPT-2 (papuGaPT2). plT5 scores best on all of these tasks except summarization, where plBART is best. In general (except for summarization), larger models yield better results. The encoder-decoder architectures prove better than the decoder-only equivalent.
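
The text-to-text setup the abstract describes (one model, one training objective, every task expressed as input text mapped to output text) can be sketched with the Hugging Face transformers API. This is a minimal sketch, not the paper's code: the checkpoint id "allegro/plt5-base" is an assumption about where plT5 is published, and a raw pre-trained checkpoint would still need task-specific fine-tuning before its outputs are useful.

```python
# Minimal sketch (assumptions labeled, not from the paper): loading a
# plT5-style checkpoint with Hugging Face transformers and running
# generation through the single text-to-text interface.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "allegro/plt5-base"  # assumed Hub checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Text-to-text framing: every task (KLEJ classification, en-pl translation,
# summarization, QA) is cast as "input text -> output text", so the same
# fine-tuning recipe covers all of them. The pre-trained model below would
# need fine-tuning on a task before producing useful output.
inputs = tokenizer(
    "translate English to Polish: The benchmark covers four task families.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Under this framing, fine-tuning changes only the training data, not the objective or the architecture, which is why one encoder-decoder model can serve the adapted KLEJ tasks, translation, summarization, and question answering alike.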

Authors (7)
  1. Aleksandra Chrabrowa (2 papers)
  2. Łukasz Dragan (1 paper)
  3. Karol Grzegorczyk (3 papers)
  4. Dariusz Kajtoch (10 papers)
  5. Mikołaj Koszowski (3 papers)
  6. Robert Mroczkowski (4 papers)
  7. Piotr Rybak (10 papers)
Citations (18)