Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 92 tok/s
Gemini 2.5 Pro 53 tok/s Pro
GPT-5 Medium 36 tok/s
GPT-5 High 36 tok/s Pro
GPT-4o 113 tok/s
GPT OSS 120B 472 tok/s Pro
Kimi K2 214 tok/s Pro
2000 character limit reached

Evaluation for Change (2212.11670v1)

Published 20 Dec 2022 in cs.CL, cs.AI, and cs.LG

Abstract: Evaluation is the central means for assessing, understanding, and communicating about NLP models. In this position paper, we argue evaluation should be more than that: it is a force for driving change, carrying a sociological and political character beyond its technical dimensions. As a force, evaluation's power arises from its adoption: under our view, evaluation succeeds when it achieves the desired change in the field. Further, by framing evaluation as a force, we consider how it competes with other forces. Under our analysis, we conjecture that the current trajectory of NLP suggests evaluation's power is waning, in spite of its potential for realizing more pluralistic ambitions in the field. We conclude by discussing the legitimacy of this power, who acquires this power and how it distributes. Ultimately, we hope the research community will more aggressively harness evaluation for change.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Authors (1)

X Twitter Logo Streamline Icon: https://streamlinehq.com