Papers
Topics
Authors
Recent
2000 character limit reached

When Incentives Backfire, Data Stops Being Human

Published 11 Feb 2025 in cs.CY, cs.AI, cs.CL, cs.CV, cs.HC, and cs.LG | (2502.07732v2)

Abstract: Progress in AI has relied on human-generated data, from annotator marketplaces to the wider Internet. However, the widespread use of LLMs now threatens the quality and integrity of human-generated data on these very platforms. We argue that this issue goes beyond the immediate challenge of filtering AI-generated content -- it reveals deeper flaws in how data collection systems are designed. Existing systems often prioritize speed, scale, and efficiency at the cost of intrinsic human motivation, leading to declining engagement and data quality. We propose that rethinking data collection systems to align with contributors' intrinsic motivations -- rather than relying solely on external incentives -- can help sustain high-quality data sourcing at scale while maintaining contributor trust and long-term participation.

Summary

We haven't generated a summary for this paper yet.

Whiteboard

Paper to Video (Beta)

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 2 likes about this paper.