Estimate the fraction of apathetic votes on Chatbot Arena
Determine the fraction r of Chatbot Arena users who vote apathetically, that is, who submit random or low-quality preference votes. The goal is to quantify how prevalent apathetic voting is in the platform’s human preference dataset and to assess its effect on leaderboard reliability.
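To illustrate why r matters for leaderboard reliability, the following is a minimal sketch under a simplifying assumption that is not taken from the source: each apathetic vote is treated as a fair coin flip between two models, and every other vote favors model A with a true preference probability p. The function names and the parameters p, n_votes, and seed are hypothetical and chosen only for this illustration.

```python
import random


def expected_observed_win_rate(p: float, r: float) -> float:
    """Expected win rate of model A when a fraction r of votes are coin flips.

    Assumption (illustrative only): apathetic votes pick either model with
    probability 0.5; the remaining votes pick model A with probability p.
    """
    return (1.0 - r) * p + 0.5 * r


def simulate_observed_win_rate(p: float, r: float, n_votes: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity under the same assumption."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_votes):
        if rng.random() < r:
            # Apathetic vote: a fair coin flip, independent of model quality.
            wins += rng.random() < 0.5
        else:
            # Engaged vote: model A wins with the true preference probability p.
            wins += rng.random() < p
    return wins / n_votes


if __name__ == "__main__":
    p, r = 0.7, 0.2
    print("expected :", expected_observed_win_rate(p, r))
    print("simulated:", simulate_observed_win_rate(p, r, n_votes=200_000, seed=1))
```

Under this toy model, the observed win rate is pulled toward 0.5 as r grows. The observed win rate alone also cannot identify r, because p is unknown as well: one observed quantity depends on two unknowns, so some external information about voter behavior would be needed.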
References
Note that there are no existing studies characterizing the incentives or behaviors of an average user on open platforms like Chatbot Arena. Therefore, we have no way of estimating the fraction r of apathetic.
— Challenges in Trustworthy Human Evaluation of Chatbots
(2412.04363 - Zhao et al., 2024) in Section 3.1 (Apathetic Voting), Results paragraph after Table 1