Papers
Topics
Authors
Recent
Search
2000 character limit reached

Towards the Scalable Evaluation of Cooperativeness in Language Models

Published 16 Mar 2023 in cs.CL and cs.LG | (2303.13360v1)

Abstract: It is likely that AI systems driven by pre-trained LLMs (PLMs) will increasingly be used to assist humans in high-stakes interactions with other agents, such as negotiation or conflict resolution. Consistent with the goals of Cooperative AI \citep{dafoe_open_2020}, we wish to understand and shape the multi-agent behaviors of PLMs in a pro-social manner. An important first step is the evaluation of model behaviour across diverse cooperation problems. Since desired behaviour in an interaction depends upon precise game-theoretic structure, we focus on generating scenarios with particular structures with both crowdworkers and a LLM. Our work proceeds as follows. First, we discuss key methodological issues in the generation of scenarios corresponding to particular game-theoretic structures. Second, we employ both crowdworkers and a LLM to generate such scenarios. We find that the quality of generations tends to be mediocre in both cases. We additionally get both crowdworkers and a LLM to judge whether given scenarios align with their intended game-theoretic structure, finding mixed results depending on the game. Third, we provide a dataset of scenario based on our data generated. We provide both quantitative and qualitative evaluations of UnifiedQA and GPT-3 on this dataset. We find that instruct-tuned models tend to act in a way that could be perceived as cooperative when scaled up, while other models seemed to have flat scaling trends.

Citations (4)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.