Confidence-Diversity Calibration of AI Judgement Enables Reliable Qualitative Coding (2508.02029v1)

Published 4 Aug 2025 in cs.LG and cs.AI

Abstract: LLMs enable qualitative coding at large scale, but assessing the reliability of their output remains challenging in domains where human experts seldom agree. Analysing 5,680 coding decisions from eight state-of-the-art LLMs across ten thematic categories, we confirm that a model's mean self-confidence already tracks inter-model agreement closely (Pearson r = 0.82). Adding model diversity, quantified as the normalised Shannon entropy of the panel's votes, turns this single cue into a dual signal that explains agreement almost completely (R² = 0.979). The confidence-diversity duo enables a three-tier workflow that auto-accepts 35% of segments with <5% audit-detected error and routes the remainder for targeted human review, cutting manual effort by up to 65%. Cross-domain replication on six public datasets spanning finance, medicine, law and multilingual tasks confirms these gains (kappa improvements of 0.20-0.78). Our results establish a generalisable, evidence-based criterion for calibrating AI judgement in qualitative research.
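
To make the dual signal concrete, here is a minimal sketch of how the normalised Shannon entropy of a panel's votes and a three-tier routing rule might be computed. The abstract does not give the paper's exact normalisation, thresholds or tier names, so everything here is an assumption for illustration: entropy is divided by the log of the number of categories, and the function names `normalised_entropy` and `route`, the cut-offs and the tier labels are hypothetical.

```python
import math
from collections import Counter


def normalised_entropy(votes, n_labels):
    """Shannon entropy of the panel's vote distribution, scaled to [0, 1].

    Assumption: normalise by log(n_labels); the paper may normalise differently.
    """
    if n_labels < 2 or not votes:
        return 0.0
    total = len(votes)
    counts = Counter(votes)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # max() guards against a negative zero when all votes agree.
    return max(0.0, entropy / math.log(n_labels))


def route(mean_confidence, diversity,
          conf_hi=0.9, div_lo=0.2, conf_lo=0.6, div_hi=0.6):
    """Assign a segment to one of three tiers.

    All four thresholds are hypothetical placeholders, not values from the paper.
    """
    if mean_confidence >= conf_hi and diversity <= div_lo:
        return "auto_accept"      # high confidence, near-unanimous panel
    if mean_confidence < conf_lo or diversity > div_hi:
        return "expert_review"    # low confidence or a badly split panel
    return "light_review"         # everything in between


# Example: eight models vote on one segment with ten possible categories.
votes = ["trust"] * 7 + ["risk"]
diversity = normalised_entropy(votes, n_labels=10)
tier = route(mean_confidence=0.93, diversity=diversity)
```

In practice the thresholds would be tuned against an audited sample so that the auto-accept tier stays under the target error rate.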

Authors (2)
