- The paper demonstrates that embedding-based methods capture annotator nuances effectively, achieving a mean absolute error of 0.61.
- It develops three methodologies (neural collaborative filtering, in-context learning, and an embedding-based strategy) to integrate individual annotator ratings with text data.
- The study highlights a shift from demographic-centered models to survey-based annotator profiling, raising ethical considerations and data privacy concerns.
Accurate and Data-Efficient Toxicity Prediction when Annotators Disagree
This paper introduces novel approaches to improve toxicity prediction in text using individual annotator ratings, addressing the challenges posed by disagreements among annotators in subjective NLP tasks. Traditional approaches often aggregate labels through majority voting, potentially discarding crucial nuances present in individual annotator judgments. This research puts forth three methodologies: a neural collaborative filtering (NCF) approach, an in-context learning (ICL) approach, and an embedding-based strategy. These approaches leverage annotator-specific information, such as demographics and survey data, to enhance prediction accuracy.
Methodology
- Neural Collaborative Filtering (NCF): This approach fuses annotator data with the text through a hybrid neural architecture to predict per-annotator toxicity ratings (a minimal sketch follows this list). Despite its conceptual appeal, NCF did not outperform baseline models, as the learned annotator embeddings did not capture meaningful interactions between annotator behavior and the text.
- Embedding-Based Architecture: This approach combines annotator information with text embeddings and emerged as the most effective of the three, achieving the highest accuracy and demonstrating the value of modeling annotator histories and preferences alongside the text (see the second sketch after this list).
- In-Context Learning (ICL): Prompting LLMs with annotator context improved accuracy over baseline models, though it did not surpass the embedding-based method. Prominent LLMs such as Mistral and GPT-3.5 were evaluated, and their performance indicates that contextual annotator information helps the model's predictions (see the prompt-construction sketch after this list).
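
To make the NCF idea concrete, the sketch below shows one way a hybrid model might combine a learned per-annotator embedding with a precomputed text embedding. The layer sizes, embedding dimensions, and regression head are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a hybrid NCF-style rater model (illustrative; dimensions are assumptions).
import torch
import torch.nn as nn

class NCFToxicityModel(nn.Module):
    def __init__(self, num_annotators: int, text_dim: int = 768, annot_dim: int = 32):
        super().__init__()
        # Learned embedding per annotator ID: the collaborative-filtering component.
        self.annotator_emb = nn.Embedding(num_annotators, annot_dim)
        # MLP over the concatenated text + annotator representation.
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + annot_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 1),  # regression head: predicted toxicity rating
        )

    def forward(self, text_emb: torch.Tensor, annotator_ids: torch.Tensor) -> torch.Tensor:
        a = self.annotator_emb(annotator_ids)   # (batch, annot_dim)
        x = torch.cat([text_emb, a], dim=-1)    # fuse text and annotator signals
        return self.mlp(x).squeeze(-1)          # (batch,) predicted ratings
```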
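
The embedding-based strategy can be pictured as feature concatenation followed by a regressor: encode the text, append a vector of encoded annotator survey responses, and fit a model on the combined features. The sentence encoder, the survey encoding, the toy data, and the regressor below are all hypothetical stand-ins for whatever the paper actually used.

```python
# Minimal sketch of the embedding-based idea: concatenate a text embedding with an
# annotator feature vector (e.g., encoded survey responses) and fit a regressor.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder would do here

def featurize(texts, annotator_features):
    """Concatenate text embeddings with per-annotator feature vectors."""
    text_embs = encoder.encode(texts)                        # (n, text_dim)
    return np.concatenate([text_embs, annotator_features], axis=1)

# Toy data: two (comment, annotator) pairs with hypothetical survey encodings and ratings.
texts = ["you are an idiot", "have a nice day"]
annot_feats = np.array([[0.2, 0.9, 0.1], [0.7, 0.3, 0.5]])  # hypothetical survey features
ratings = np.array([3.0, 0.0])                               # hypothetical 0-4 toxicity scale

X = featurize(texts, annot_feats)
model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500).fit(X, ratings)
print("MAE:", mean_absolute_error(ratings, model.predict(X)))
```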
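
For ICL, the key step is building a prompt that carries the annotator's context (a survey summary and a few of that annotator's prior ratings) ahead of the target comment. The prompt wording, the 0-4 scale, and the use of the OpenAI chat API below are illustrative assumptions rather than the paper's exact setup.

```python
# Minimal sketch of the in-context-learning setup: annotator context precedes the target text.
from openai import OpenAI

def build_prompt(survey_summary: str, past_examples: list[tuple[str, int]], target_text: str) -> str:
    history = "\n".join(f'Comment: "{t}" -> rating: {r}' for t, r in past_examples)
    return (
        "You are predicting how one specific annotator rates toxicity on a 0-4 scale.\n"
        f"Annotator survey profile: {survey_summary}\n"
        f"Previous ratings by this annotator:\n{history}\n"
        f'Now predict this annotator\'s rating for: "{target_text}"\n'
        "Answer with a single number."
    )

prompt = build_prompt(
    survey_summary="finds profanity mildly offensive; strongly opposes identity-based attacks",
    past_examples=[("you are an idiot", 3), ("have a nice day", 0)],
    target_text="nobody wants you here",
)
client = OpenAI()  # requires OPENAI_API_KEY in the environment
reply = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(reply.choices[0].message.content)
```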
Results
Among the tested approaches, the embedding-based model achieved the lowest mean absolute error (MAE) of 0.61, outperforming the NCF and ICL methods. The results indicate that demographic information, though useful on its own, becomes less critical when rich survey-response data is available. Using demographics predicted from survey responses yielded comparable performance, suggesting that survey responses capture essential annotator characteristics beyond explicit demographics.
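
For reference, MAE here is the average absolute difference between predicted and observed annotator ratings over the N evaluated (comment, annotator) pairs:

```latex
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{y}_i - y_i \right|
```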
Implications and Future Work
This research provides valuable insights into enhancing the predictive capabilities of NLP models in subjective contexts by modeling annotator-specific preferences. It points towards a shift from demographic-centered modeling to preference-based insights that can be derived from survey responses.
The implications of this paper extend into the broader domain of AI and ethics, particularly the privacy risks of inferring demographics from seemingly innocuous data. Because models can reach similar performance without explicit demographic data, the work raises questions about consent and data protection in AI research.
Future research should address these privacy concerns, explore ways to mitigate bias, and consider the ethical ramifications of proxy demographics. Additionally, advancing scalability and performance across diverse cultural contexts will enhance the practical applicability of these models.
Conclusion
The findings underscore the potential of embedding-based approaches in capturing annotator nuance, signaling a step forward in handling subjective NLP tasks. The paper lays the groundwork for future explorations into more ethical and efficient modeling strategies, advocating for personalized prediction mechanisms responsive to individual annotator preferences.