
MentalBERT: Publicly Available Pretrained Language Models for Mental Healthcare (2110.15621v1)

Published 29 Oct 2021 in cs.CL

Abstract: Mental health is a critical issue in modern society, and mental disorders can escalate to suicidal ideation without adequate treatment. Early detection of mental disorders and suicidal ideation from social content provides a potential avenue for effective social intervention. Recent advances in pretrained contextualized language representations have promoted the development of several domain-specific pretrained models and facilitated several downstream applications. However, there are no existing pretrained language models for mental healthcare. This paper trains and releases two pretrained masked language models, MentalBERT and MentalRoBERTa, to benefit machine learning research in the mental healthcare community. We also evaluate our trained domain-specific models and several variants of pretrained language models on several mental disorder detection benchmarks and demonstrate that language representations pretrained in the target domain improve the performance of mental health detection tasks.

An Analysis and Evaluation of MentalBERT: Pretrained Language Models for Mental Healthcare

The paper "MentalBERT: Publicly Available Pretrained Language Models for Mental Healthcare" presents the development and evaluation of domain-specific pretrained masked language models, MentalBERT and MentalRoBERTa, aimed at improving the detection of mental health disorders through analysis of social media content. Given the prevalence of mental health issues and the potential for predictive intervention, the work's significance lies in tailoring advanced language models specifically to mental healthcare applications.

Overview

The research identifies a significant gap: although domain-specific language models exist for fields such as biomedicine and clinical text, none existed for mental healthcare. To fill this gap, the authors pretrained MentalBERT and MentalRoBERTa on social media data sourced from forums dedicated to mental health discussions on platforms like Reddit. Unlike general-purpose models or those focused on unrelated domains, MentalBERT and MentalRoBERTa are intended to capture the nuanced language characteristic of discussions and self-expressions of mental health conditions.

Methodology

The paper uses a domain-adaptive pretraining approach. MentalBERT and MentalRoBERTa were initialized from existing BERT and RoBERTa checkpoints, respectively, inheriting broad-domain knowledge before being adapted to mental health data; reusing checkpoints in this way also conserves computational resources. The continued pretraining was conducted over a corpus of 13.6 million sentences focused on mental health topics, using a setup spanning multiple GPU nodes.
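As a concrete illustration of the data-preparation side of continued pretraining, the sketch below packs tokenized sentences into fixed-length training sequences. This is a minimal sketch, not the authors' actual pipeline: the special-token IDs, the greedy packing strategy, and the sequence length are assumptions mirroring common BERT-base defaults.

```python
# Illustrative sketch: pack tokenized sentences into fixed-length
# sequences for continued (domain-adaptive) pretraining.
# Special-token IDs below are assumed BERT-base defaults.
CLS, SEP, PAD = 101, 102, 0

def pack_sequences(sentences, max_len=128):
    """Greedily pack tokenized sentences into [CLS] ... [SEP] blocks,
    padding each block to max_len token IDs."""
    sequences, current = [], [CLS]
    for sent in sentences:
        sent = sent[:max_len - 2]  # truncate overlong sentences
        # +1 leaves room for the trailing [SEP]
        if len(current) + len(sent) + 1 > max_len:
            current.append(SEP)
            current += [PAD] * (max_len - len(current))
            sequences.append(current)
            current = [CLS]
        current = current + sent
    current.append(SEP)
    current += [PAD] * (max_len - len(current))
    sequences.append(current)
    return sequences

# Toy corpus of already-tokenized sentences (IDs are arbitrary).
corpus = [[5, 6, 7], [8, 9], [10] * 120, [11, 12]]
batches = pack_sequences(corpus, max_len=16)
```

In a real pipeline this role is played by the tokenizer and data collator of the training framework; the point here is only that domain adaptation consumes the same fixed-length input format as the original BERT/RoBERTa pretraining.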

Standard transformer-based architectures were used with a masked language modeling objective to learn effective text representations. Fine-tuning covered multiple downstream tasks, including binary mental disorder detection and multi-class classification of different mental disorders, evaluated across multiple benchmark datasets drawn from Reddit and Twitter.
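The masked language modeling objective can be made concrete with the standard BERT masking recipe: roughly 15% of non-special tokens are selected for prediction, of which 80% are replaced with [MASK], 10% with a random vocabulary token, and 10% left unchanged. The snippet below is a self-contained sketch of that selection rule, assuming BERT-base token IDs and vocabulary size (the paper does not spell out its masking hyperparameters, so these are assumptions):

```python
import random

MASK_ID, VOCAB_SIZE = 103, 30522  # assumed BERT-base values
SPECIAL = {0, 101, 102}           # [PAD], [CLS], [SEP]

def mask_tokens(token_ids, mask_prob=0.15, rng=None):
    """Return (corrupted_ids, labels) for masked language modeling.

    labels[i] holds the original token where a prediction is required,
    and -100 (a conventional ignore index) elsewhere.
    """
    rng = rng or random.Random()
    corrupted = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if tok in SPECIAL or rng.random() >= mask_prob:
            continue  # never mask special tokens
        labels[i] = tok
        r = rng.random()
        if r < 0.8:                        # 80%: replace with [MASK]
            corrupted[i] = MASK_ID
        elif r < 0.9:                      # 10%: random vocabulary token
            corrupted[i] = rng.randrange(VOCAB_SIZE)
        # else 10%: keep the original token
    return corrupted, labels

ids = [101, 2023, 2003, 1037, 7279, 102]
corrupted, labels = mask_tokens(ids, mask_prob=0.5, rng=random.Random(0))
```

The model is trained to predict the original tokens at the positions where `labels` is not -100; fine-tuning then replaces this head with a classification head over the [CLS] representation.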

Results

The empirical evaluations show that both MentalBERT and MentalRoBERTa outperform baselines such as general-purpose BERT and RoBERTa, as well as other domain-specific models like BioBERT and ClinicalBERT, across numerous benchmarks. Notably, MentalRoBERTa consistently delivered the best performance in detecting signs of depression, suicidal ideation, stress, and anxiety across datasets. This strongly indicates that continued pretraining on a domain-specific corpus substantially improves performance on related detection tasks.

Implications and Future Work

The implications of MentalBERT and MentalRoBERTa are manifold. Practically, their application could facilitate early intervention strategies in mental health support through automated monitoring systems, providing a valuable tool for clinicians and social workers aiming for early detection and proactive engagement.

However, deploying such systems raises potential biases and ethical concerns, especially regarding interpretability and fairness. Future research directions include multilingual extensions of the existing models, addressing their current limitation to English and ensuring broader applicability.

Conclusion

This paper is a significant step toward bridging the gap between advanced NLP techniques and practical mental health applications. By offering domain-adapted models curated specifically for mental healthcare, the authors open avenues for impactful applications in both research and healthcare settings. Moreover, their careful handling of privacy in public social media data underscores a commitment to ethical research practices, providing a responsible framework for future work in sensitive areas like mental health. The release of MentalBERT and MentalRoBERTa on platforms like Hugging Face further amplifies their potential for widespread academic and clinical use, fostering ongoing innovation and collaboration in this critical field.

Authors (6)
  1. Shaoxiong Ji (39 papers)
  2. Tianlin Zhang (17 papers)
  3. Luna Ansari (1 paper)
  4. Jie Fu (229 papers)
  5. Prayag Tiwari (41 papers)
  6. Erik Cambria (136 papers)
Citations (193)