
Predicting Performance for Natural Language Processing Tasks (2005.00870v1)

Published 2 May 2020 in cs.CL

Abstract: Given the complexity of combinations of tasks, languages, and domains in NLP research, it is computationally prohibitive to exhaustively test newly proposed models on each possible experimental setting. In this work, we attempt to explore the possibility of gaining plausible judgments of how well an NLP model can perform under an experimental setting, without actually training or testing the model. To do so, we build regression models to predict the evaluation score of an NLP experiment given the experimental settings as input. Experimenting on 9 different NLP tasks, we find that our predictors can produce meaningful predictions over unseen languages and different modeling architectures, outperforming reasonable baselines as well as human experts. Going further, we outline how our predictor can be used to find a small subset of representative experiments that should be run in order to obtain plausible predictions for all other experimental settings.
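The core idea in the abstract can be illustrated with a minimal sketch: featurize each experimental setting (task, language, dataset size) and fit a regression model that maps settings to evaluation scores, so scores for unseen settings can be estimated without training the actual NLP model. The feature scheme, ridge regression, and all data below are illustrative assumptions, not the paper's actual predictor or results.

```python
import math

def featurize(setting, tasks, langs):
    """One-hot encode task and language; append log10 dataset size and a bias."""
    vec = [1.0 if setting["task"] == t else 0.0 for t in tasks]
    vec += [1.0 if setting["lang"] == l else 0.0 for l in langs]
    vec.append(math.log10(setting["train_size"]))
    vec.append(1.0)  # bias term
    return vec

def ridge_fit(X, y, lam=1e-2):
    """Solve (X^T X + lam*I) w = X^T y by Gaussian elimination with pivoting."""
    n, d = len(X), len(X[0])
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(X[k][i] * y[k] for k in range(n)) for i in range(d)]
    for col in range(d):                      # forward elimination
        piv = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * d                             # back substitution
    for i in range(d - 1, -1, -1):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]
    return w

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

tasks = ["mt", "pos"]
langs = ["de", "fi", "cs"]
# Toy "past experiments": (setting, observed score). Purely made-up numbers.
history = [
    ({"task": "mt",  "lang": "de", "train_size": 100000}, 31.0),
    ({"task": "mt",  "lang": "fi", "train_size": 50000},  18.5),
    ({"task": "pos", "lang": "de", "train_size": 20000},  95.2),
    ({"task": "pos", "lang": "cs", "train_size": 15000},  93.8),
    ({"task": "mt",  "lang": "cs", "train_size": 80000},  24.0),
]
X = [featurize(s, tasks, langs) for s, _ in history]
y = [score for _, score in history]
w = ridge_fit(X, y)

# Estimate a score for an unseen setting without running the experiment.
unseen = {"task": "pos", "lang": "fi", "train_size": 30000}
estimate = predict(w, featurize(unseen, tasks, langs))
print(round(estimate, 1))
```

The paper reports that such predictors (with richer features and stronger regressors) can beat baselines and human experts; this toy version only conveys the interface: past experimental results in, score estimates for new settings out.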

Authors (5)
  1. Mengzhou Xia (34 papers)
  2. Antonios Anastasopoulos (111 papers)
  3. Ruochen Xu (35 papers)
  4. Yiming Yang (151 papers)
  5. Graham Neubig (342 papers)
Citations (57)