Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media

Kavli Affiliate: Dan Luo

| First 5 Authors: Hongzhi Qi, Qing Zhao, Jianqiang Li, Changwei Song, Wei Zhai

| Summary:

On social media, users often express their personal feelings, which may
exhibit cognitive distortions or even suicidal tendencies on certain specific
topics. Early recognition of these signs is critical for effective
psychological intervention. In this paper, we introduce two novel datasets from
Chinese social media: SOS-HL-1K for suicidal risk classification and
SocialCD-3K for cognitive distortion detection. The SOS-HL-1K dataset
contains 1,249 posts, and SocialCD-3K is a multi-label classification
dataset containing 3,407 posts. We propose a comprehensive evaluation
using two supervised learning methods and eight large language models (LLMs) on
the proposed datasets. From the prompt engineering perspective, we experimented
with two types of prompt strategies: four zero-shot and five few-shot
strategies. We also evaluated the performance of the LLMs after fine-tuning on
the proposed tasks. The experimental results show that there is still a huge
gap between LLMs relying only on prompt engineering and supervised learning. In
the suicide risk classification task, this gap is 6.95 percentage points in
F1-score, while in the cognitive distortion task it is even more pronounced,
reaching 31.53 percentage points
points in F1-score. However, after fine-tuning, this difference is
significantly reduced. In the suicide risk and cognitive distortion
classification tasks, the gap decreases to 4.31 and 3.14 percentage points,
respectively. This research
highlights the potential of LLMs in psychological contexts, but supervised
learning remains necessary for more challenging tasks. All datasets and code
are made available.
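The headline metric above is F1-score on a multi-label task. As a minimal illustration (not the authors' code, and using invented toy labels rather than the SocialCD-3K data), this is how a weighted F1-score over multi-label predictions is conventionally computed with scikit-learn:

```python
# Hedged sketch: weighted F1 for a multi-label task, as commonly
# reported for imbalanced classification. All data here is invented.
import numpy as np
from sklearn.metrics import f1_score

# Toy binary indicator matrices: 4 posts x 3 hypothetical
# cognitive-distortion labels (rows may carry several 1s).
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1]])

# "weighted" averages per-label F1 scores, weighting each label
# by its support (number of true instances).
weighted_f1 = f1_score(y_true, y_pred, average="weighted")
print(f"weighted F1 = {weighted_f1:.4f}")  # -> weighted F1 = 0.7778
```

A gap of "N percentage points in F1-score" then means the difference between two such scores, e.g. 0.85 for a supervised model versus 0.78 for a prompted LLM is a 7-point gap.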

| Search Query: ArXiv Query: search_query=au:"Dan Luo"&id_list=&start=0&max_results=3
