
Towards Algorithmic Fidelity: Mental Health Representation across Demographics in Synthetic vs. Human-generated Data

Published 25 Mar 2024 in cs.AI, cs.CL, and cs.CY | (2403.16909v1)

Abstract: Synthetic data generation has the potential to impact applications and domains with scarce data. However, before such data is used for sensitive tasks such as mental health, we need an understanding of how different demographics are represented in it. In our paper, we analyze the potential of producing synthetic data using GPT-3 by exploring the various stressors it attributes to different race and gender combinations, to provide insight for future researchers looking into using LLMs for data generation. Using GPT-3, we develop HEADROOM, a synthetic dataset of 3,120 posts about depression-triggering stressors, by controlling for race, gender, and time frame (before and after COVID-19). Using this dataset, we conduct semantic and lexical analyses to (1) identify the predominant stressors for each demographic group; and (2) compare our synthetic data to a human-generated dataset. We present the procedures to generate queries to develop depression data using GPT-3, and conduct analyses to uncover the types of stressors it assigns to demographic groups, which could be used to test the limitations of LLMs for synthetic data generation for depression data. Our findings show that synthetic data mimics some of the human-generated data distribution for the predominant depression stressors across diverse demographics.
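
As a rough illustration of what "controlling for race, gender, and time frame" could look like when generating queries, the sketch below enumerates one prompt per demographic combination. It is not the authors' procedure: the category values, the prompt template, and the `build_queries` helper are all hypothetical placeholders, since the abstract does not give the actual categories or prompt wording, and no model call is made.

```python
from itertools import product

# Hypothetical placeholder values: the abstract does not list the actual
# categories used to build HEADROOM.
RACES = ["race_A", "race_B"]
GENDERS = ["gender_A", "gender_B"]
TIME_FRAMES = ["before COVID-19", "after COVID-19"]

# Hypothetical template, not the authors' prompt.
TEMPLATE = (
    "Write a short post by a {race} {gender} person describing what "
    "triggered their depression {time_frame}."
)


def build_queries(races, genders, time_frames, template=TEMPLATE):
    """Return one query record per (race, gender, time frame) combination."""
    queries = []
    for race, gender, time_frame in product(races, genders, time_frames):
        queries.append(
            {
                "race": race,
                "gender": gender,
                "time_frame": time_frame,
                "prompt": template.format(
                    race=race, gender=gender, time_frame=time_frame
                ),
            }
        )
    return queries


queries = build_queries(RACES, GENDERS, TIME_FRAMES)
print(len(queries))
```

Keeping the controlled attributes alongside each prompt means every generated post can later be grouped by demographic when comparing stressors across groups.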
