Your Model Is Not Predicting Depression Well And That Is Why: A Case Study of PRIMATE Dataset

Published 1 Mar 2024 in cs.CL (arXiv:2403.00438v1)

Abstract: This paper addresses the quality of annotations in mental health datasets used for NLP-based depression level estimation from social media texts. While previous research relies on social media datasets annotated with binary categories, i.e. depressed or non-depressed, recent datasets such as D2S and PRIMATE aim for nuanced annotations using the PHQ-9 symptoms. However, most of these datasets rely on crowd workers without domain knowledge for annotation. Focusing on the PRIMATE dataset, our study reveals concerns regarding annotation validity, particularly for the "lack of interest or pleasure" (anhedonia) symptom. Through reannotation by a mental health professional, we introduce finer-grained labels and textual spans as evidence, identifying a notable number of false positives. Our refined annotations, to be released under a Data Use Agreement, offer a higher-quality test set for anhedonia detection. This study underscores the necessity of addressing annotation quality issues in mental health datasets, advocating for improved methodologies to enhance the reliability of NLP models in mental health assessments.
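The kind of audit the abstract describes can be sketched in a few lines: given crowd-worker labels and an expert's reannotations for one symptom, count the false positives (posts the crowd marked positive but the expert rejected) and measure chance-corrected agreement with Cohen's kappa. The labels below are invented for illustration; the actual PRIMATE annotations and the refined labels are not reproduced here.

```python
def false_positives(crowd, expert):
    """Count items the crowd labeled positive (1) that the expert labeled negative (0)."""
    return sum(1 for c, e in zip(crowd, expert) if c == 1 and e == 0)

def cohen_kappa(a, b):
    """Cohen's kappa for two binary annotators over the same items."""
    n = len(a)
    p_observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement from each annotator's marginal positive rate
    pa, pb = sum(a) / n, sum(b) / n
    p_chance = pa * pb + (1 - pa) * (1 - pb)
    return (p_observed - p_chance) / (1 - p_chance) if p_chance != 1 else 1.0

# Hypothetical labels for 8 posts on the anhedonia symptom
crowd  = [1, 1, 0, 1, 0, 1, 0, 0]   # crowd-worker annotations
expert = [1, 0, 0, 0, 0, 1, 0, 0]   # clinician reannotation

print(false_positives(crowd, expert))         # → 2
print(round(cohen_kappa(crowd, expert), 2))   # → 0.5
```

A kappa well below 1 on a symptom like anhedonia, driven largely by crowd false positives, is exactly the pattern that motivates the paper's expert reannotation.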
