Automatic detection of problem-gambling signs from online texts using large language models (2312.00804v1)

Published 24 Nov 2023 in cs.CL and cs.LG

Abstract: Problem gambling is a major public health concern and is associated with profound psychological distress and economic problems. There are numerous gambling communities on the internet where users exchange information about games, gambling tactics, and gambling-related problems. Individuals exhibiting higher levels of problem gambling engage more in such communities, so online gambling communities may provide insights into problem-gambling behaviour. Using data scraped from a major German gambling discussion board, we fine-tuned an LLM, specifically a Bidirectional Encoder Representations from Transformers (BERT) model, to predict signs of problem gambling from forum posts. Training data were generated by manual annotation, taking into account diagnostic criteria and gambling-related cognitive distortions. Using k-fold cross-validation, our models achieved a precision of 0.95 and an F1 score of 0.71, demonstrating that satisfactory classification performance can be achieved by generating high-quality training material through manual annotation based on diagnostic criteria. The current study confirms that a BERT-based model can be used reliably on small data sets to detect signatures of problem gambling in online communication data. Such computational approaches may have potential for detecting changes in problem-gambling prevalence among online users.
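
The abstract reports precision and F1 but not recall. As a rough reading aid, the sketch below shows how the three metrics relate and what recall the reported figures would imply. It assumes that both figures refer to the same class and to the same aggregate evaluation; metrics averaged over cross-validation folds need not satisfy the identity exactly, so the result is only an approximation. The function names are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def implied_recall(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given P and F1.

    Assumption: P and F1 describe the same class and the same predictions.
    """
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall exists for this precision/F1 pair")
    return f1 * precision / denom


if __name__ == "__main__":
    # Figures quoted in the abstract: precision 0.95, F1 0.71.
    r = implied_recall(0.95, 0.71)
    print(f"implied recall: {r:.2f}")
```

Under these assumptions the implied recall is roughly 0.57, which would mean the classifier rarely raises false alarms but misses a substantial share of posts showing problem-gambling signs.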

