MSCSIS Proceedings - Reitz - UNCW CSIS Proceedings

Background: Suicide is a serious problem that affects individuals all over the world. The suicide rate increased by 30% between 2000 and 2018 [1]. In recent years, suicide has ranked among the top 10 leading causes of death [1]. In 2020, 3.2 million people planned suicide; of those, 1.2 million attempted suicide, and 45,979 died by suicide [1]. Suicide survivors and the people around them are at high risk of developing suicidal ideation [2]. They often experience complex emotions involving guilt, shame, anger, and denial [2], a state that provides fertile ground for perceiving stigma [2]. According to the Cambridge Dictionary, stigmatization is the act of treating someone or something unfairly by publicly disapproving of him, her, or it [3]. This perceived stigma around suicide is often a barrier that keeps individuals from reaching out and seeking treatment. According to the American Psychological Association, patients who received early intervention had 30% fewer suicide attempts [4]. Accordingly, numerous researchers have attempted to detect suicide ideation/attempts in social media posts across multiple platforms [2,7-12], using a wide range of artificial intelligence algorithms.

Objective: The primary purpose of this study was to develop and compare artificial intelligence algorithms to determine which could detect suicide ideation/attempts within social media data. For completeness, the data were gathered and annotated using manual coding, machine learning algorithms, and deep learning algorithms. The approaches were then compared to determine which model gave the highest accuracy and the best overall confusion matrix, keeping in mind that false positives are preferable to false negatives.

Method: The data for this study were taken from the subreddit r/SuicideWatch between 2019 and 2020 using an application programming interface. Various methods were used to annotate the collected data, including manual coding, support vector classifier, k-nearest neighbors, logistic regression, decision tree, stochastic gradient descent, random forest, Naive Bayes, and a temporal convolution neural network. As data collection continued, these algorithms were trained on sets of posts ranging from a few hundred to 1,500.

Results: At the end of the study, the machine learning algorithms performed as well as a human annotator, with an accuracy of around 90%. The exception was the scikit-learn Naive Bayes classifier, which did not match the other machine learning algorithms and reached an accuracy of around 78%. Finally, the non-optimized temporal convolution neural network performed as well as the machine learning algorithms.

Conclusion: The results showed that the temporal convolution neural network was the top performer for detecting stigma within a social media post. This network can be used in place of manual coding and performs as well as a human coder, reducing the time needed to annotate. Future work on the temporal convolution neural network may achieve results that exceed those of a human coder.
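The comparison criterion described in the abstract (highest accuracy and best confusion matrix, with false positives preferred over false negatives) can be illustrated with a minimal sketch. The labels below are invented for illustration, the names `model_a` and `model_b` are hypothetical, and the tie-breaking rule (equal accuracy, then fewer false negatives) is an assumption about how such a preference might be encoded, not the procedure used in the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary labeling task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Compute accuracy and recall alongside the raw error counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "fp": fp,
        "fn": fn,
    }


# Hypothetical annotations: 1 = post flagged as at-risk, 0 = not flagged.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # one false positive
model_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # one false negative

summary_a = summarize(y_true, model_a)
summary_b = summarize(y_true, model_b)

# Assumed selection rule: rank by accuracy first, then by fewer false negatives.
candidates = [("model_a", summary_a), ("model_b", summary_b)]
best = max(candidates, key=lambda item: (item[1]["accuracy"], -item[1]["fn"]))

print(summary_a)
print(summary_b)
print("preferred:", best[0])
```

Both hypothetical models reach the same accuracy here, so accuracy alone cannot separate them; the confusion matrix shows that one of them misses an at-risk post while the other only raises an extra flag, which is the kind of distinction the stated preference is meant to capture.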

UNCW MS Computer Science Information Systems Proceedings

Detecting Suicide Ideation/Attempts Within Social Media

Timothy Reitz