Gold Standard for Expert Ranking: A Survey on the XWiki Dataset (1603.03809v1)
Abstract: We are designing an automated technique to find and recommend experts who can help with Requirements Engineering tasks, which can be done by ranking the available people by level of expertise. To evaluate the correctness of the rankings produced by the automated technique, we want to compare them to a gold standard. In this work, we ask external people to look at a set of discussions and rank their participants, and we then evaluate how reliably these rankings can serve as a gold standard. We describe the setup and execution of this survey, the method used to build the gold standard from the subjects' rankings, and the analysis of the results to obtain and validate this gold standard. From this analysis, we conclude that we obtained a reasonable gold standard, although we lack evidence to support its total correctness. We also made the interesting observation that the most reliable subjects build the least ordered rankings (i.e. rankings that have few ranks, with several people per rank), which goes against the usual assumptions of Information Retrieval measures.