Papers

Topics

Authors

Recent

View all

AI Research Assistant

Well-researched responses based on relevant abstracts and paper content.

Custom Instructions Pro

Preferences or requirements that you'd like Emergent Mind to consider when generating responses.

Gemini 2.5 Flash

Gemini 2.5 Flash 78 tok/s

Gemini 2.5 Pro 46 tok/s Pro

GPT-5 Medium 12 tok/s Pro

GPT-5 High 14 tok/s Pro

GPT-4o 89 tok/s Pro

Kimi K2 212 tok/s Pro

GPT OSS 120B 472 tok/s Pro

Claude Sonnet 4 39 tok/s Pro

2000 character limit reached

Bayesian Nonparametric Models for Multiple Raters: a General Statistical Framework (2410.21498v3)

Published 28 Oct 2024 in stat.ME and stat.AP

Abstract: Rating procedure is crucial in many applied fields (e.g., educational, clinical, emergency). It implies that a rater (e.g., teacher, doctor) rates a subject (e.g., student, doctor) on a rating scale. Given raters variability, several statistical methods have been proposed for assessing and improving the quality of ratings. Model estimation in the presence of heterogeneity has been one of the recent challenges in this research line. Consequently, several methods have been proposed to address this issue under a parametric multilevel modelling framework, in which strong distributional assumptions are made. We propose a more flexible model under the Bayesian nonparametric (BNP) framework, in which most of those assumptions are relaxed. By eliciting hierarchical discrete nonparametric priors, the model accommodates clusters among raters and subjects, naturally accounts for heterogeneity, and improves estimates accuracy. We propose a general BNP heteroscedastic framework to analyse continuous and coarse rating data and possible latent differences among subjects and raters. The estimated densities are used to make inferences about the rating process and the quality of the ratings. By exploiting a stick-breaking representation of the Dirichlet Process, a general class of Intraclass Correlation Coefficient (ICC) indices might be derived for these models. Our method allows us to independently identify latent similarities between subjects and raters and can be applied in precise education to improve personalised teaching programs or interventions. Theoretical results about the ICC are provided together with computational strategies. Simulations and a real-world application are presented, and possible future directions are discussed.