Decoding Complexity: Exploring Human-AI Concordance in Qualitative Coding (2403.06607v1)

Published 11 Mar 2024 in cs.HC

Abstract: Qualitative data analysis provides insight into the underlying perceptions and experiences within unstructured data. However, the time-consuming nature of the coding process, especially for larger datasets, calls for innovative approaches, such as the integration of LLMs. This short paper presents initial findings from a study investigating the integration of LLMs for coding tasks of varying complexity in a real-world dataset. Our results highlight the challenges inherent in coding with extensive codebooks and contexts, both for human coders and LLMs, and suggest that the integration of LLMs into the coding process requires a task-by-task evaluation. We examine factors influencing the complexity of coding tasks and initiate a discussion on the usefulness and limitations of incorporating LLMs in qualitative research.

Authors (4)
  1. Elisabeth Kirsten
  2. Annalina Buckmann
  3. Abraham Mhaidli
  4. Steffen Becker