Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collections Accurately and Affordably (1806.00755v3)

Published 3 Jun 2018 in cs.IR

Abstract: Crowdsourcing offers an affordable and scalable means to collect relevance judgments for IR test collections. However, crowd assessors may show higher variance in judgment quality than trusted assessors. In this paper, we investigate how to effectively utilize both groups of assessors in partnership. We specifically investigate how agreement in judging is correlated with three factors: relevance category, document rankings, and topical variance. Based on this, we then propose two collaborative judging methods in which a portion of the document-topic pairs are assessed by in-house judges while the rest are assessed by crowd-workers. Experiments conducted on two TREC collections show encouraging results when we distribute work intelligently between our two groups of assessors.
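The core idea of the proposed collaborative judging methods is to partition document-topic pairs between trusted in-house judges and crowd workers. The sketch below is a hypothetical illustration of one such split, assuming a rank-based routing heuristic (the abstract notes judging agreement correlates with document rankings); the function name, fraction parameter, and pair format are illustrative assumptions, not the paper's exact method.

```python
# Hypothetical sketch: route a fraction of (topic, doc, rank) pairs to
# in-house judges and the remainder to crowd workers. The heuristic here
# (send top-ranked documents in house) is an illustrative assumption
# inspired by the abstract, not the paper's actual allocation method.

def split_judging_pool(pairs, in_house_fraction=0.3):
    """Partition pairs so the top-ranked documents (smallest rank number)
    go to in-house judges and the rest go to crowd workers."""
    by_rank = sorted(pairs, key=lambda p: p[2])       # rank 1 = top of list
    cut = int(len(by_rank) * in_house_fraction)        # size of in-house share
    return by_rank[:cut], by_rank[cut:]

# Example: 10 ranked documents for one topic
pairs = [("topic1", f"doc{i}", i) for i in range(1, 11)]
in_house, crowd = split_judging_pool(pairs)
```

With a 0.3 fraction, the three top-ranked pairs would be judged in house and the remaining seven by the crowd; any real deployment would tune the split using the agreement factors the paper studies (relevance category, rank, and topical variance).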

Authors (5)
  1. Mucahid Kutlu (23 papers)
  2. Tyler McDonnell (4 papers)
  3. Aashish Sheshadri (2 papers)
  4. Tamer Elsayed (22 papers)
  5. Matthew Lease (57 papers)
Citations (6)
