Responsible AI Researcher on the Limits of AI for Mental Health
What can AI do for mental health? Even asking the question invites controversy, especially at a time when suicide rates are reaching record highs. Add to that the complex web of factors—both clinical and non-clinical—that contribute to mental health, and you can begin to see how thorny the subject is. Still, AI’s true superpower lies in its ability to survey and process lots of data, and what is social media other than a reservoir of data?
Annika Marie Schoene, a research scientist and member of the Responsible AI practice at the Institute for Experiential AI, has been studying how the social determinants of health drive adverse health outcomes, particularly suicide. She has also looked at the many ways AI tools like natural language processing (NLP) and sentiment analysis are used to parse the more subtle, non-clinical contributors to mental illness. In an Expeditions in Experiential AI Seminar on June 12, Annika shared some of those findings.
Responsible AI for Suicide Prevention: Challenges
Suicide is the fourth leading cause of death for people aged 15-29 worldwide, and in many countries, including the U.S., those rates are trending upward. The crisis has led many to search for ways technology can help understand or even prevent suicide. A variety of social media companies, clinical organizations, and startups have been developing tools to detect mental health disorders online. But, as Annika explained in her talk, there are lots of challenges from a data science standpoint.
First, risk assessments for suicidal ideation are rarely binary and often fluctuate; you can’t diagnose a person from a single Twitter post. Second, suicide is a relatively rare event, which can lead to a high false positive rate when using predictive algorithms (a point the quick base-rate sketch below illustrates). Third, there’s not much interdisciplinary research in the area; papers are often either “very technically novel and not clinically grounded” or “not technically novel but very clinically grounded.” Fourth, when it comes to labeling content on social media, there is no ground truth, so annotators have to rely on subjective judgment. And finally, datasets collected using suicide-related keyword dictionaries usually don’t contain posts with genuine suicidal ideation, and it is hard to study suicide if you can’t distinguish genuine intent from humorous or idiomatic uses of the same language.
Even the emotion categories commonly used to annotate such posts have their limits. “Of course, this is somewhat limited in itself,” Annika added, “as a tweet could also contain more than one emotion or not really neatly fit into one of those categories. And similar to the idea of not knowing ground truth or suicide risk, we don't really know what the intended emotion is.”
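To see why the rarity of suicide makes prediction so fraught, consider a quick back-of-the-envelope sketch. The numbers below are illustrative assumptions, not figures from Annika’s talk: suppose 0.5 percent of posts in a dataset reflect genuine suicidal ideation, and a classifier catches 90 percent of true cases while correctly clearing 95 percent of everything else.

```python
# Back-of-the-envelope sketch (illustrative assumptions, not figures from the talk)
# showing why a rare outcome produces mostly false alarms,
# even for a seemingly accurate classifier.

base_rate = 0.005     # assumed share of posts with genuine suicidal ideation
sensitivity = 0.90    # assumed share of true cases the classifier flags
specificity = 0.95    # assumed share of non-cases the classifier correctly clears

flagged_true = base_rate * sensitivity                # true positives
flagged_false = (1 - base_rate) * (1 - specificity)   # false positives

precision = flagged_true / (flagged_true + flagged_false)
print(f"Share of flagged posts that are genuine: {precision:.1%}")
# Prints roughly 8%: under these assumptions, more than nine in ten flags are false alarms.
```

Under these assumed numbers, an apparently strong model raises more than ten false alarms for every genuine case it catches, which is exactly the kind of error profile that, as Annika notes, can cause real harm in a high-risk setting.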
Sentiment Analysis: Man vs. Machine
Annika went on to discuss findings from a paper she co-authored with other institute researchers, which found major inconsistencies between emotion labels annotated by humans and those predicted by language models. This is a potentially harmful situation, given how widely available automated sentiment analysis is through platforms like Hugging Face.
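To underline how low the barrier to entry is: with the open-source transformers library, running an off-the-shelf emotion classifier takes only a few lines. The sketch below is purely illustrative and is not the setup from Annika’s paper; the model named is just one publicly hosted emotion classifier among many, and the example post is invented.

```python
# Illustrative sketch: off-the-shelf emotion prediction in a few lines.
# Not the setup from Annika's paper; the model below is simply one publicly
# hosted emotion classifier among many, and the example post is invented.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",
)

post = "I'm so done with everything."  # deliberately ambiguous example
prediction = classifier(post)[0]
print(prediction["label"], round(prediction["score"], 2))
```

A human annotator might read that same post as frustration, dark humor, or genuine distress; the model simply returns its single most confident label, which is the kind of gap between human and machine judgment the paper highlights.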
Moreover, Annika’s paper suggests it may not be possible for AI to parse the finer aspects of human emotions. A human-led “quality check” is likely necessary for all AI-driven solutions in areas as sensitive as mental health.
“This led us really to consider that language models for emotion prediction may not be actually capable of finding finer distinctions and granular suicide-related content,” she said. “Our findings also led us to question how useful and credible such techniques are when you want to use emotion features in suicide detection tasks, especially given that this is a really high-risk scenario and a high false positive rate could cause real harm.”
It’s a given that suicide is hard to predict, but few studies account for the social determinants of health when assessing causes of death. That omission muddies the data and makes suicide all the harder to study. In a meta-analysis, Annika showed how many papers do not include social factors such as socioeconomic issues, housing or food insecurity, legal troubles, and discrimination when determining the cause of a suicide.
Breaking the Stigma
Annika closed with a discussion of the ethical issues surrounding AI-based suicide-prevention tools. Among the most concerning is that, currently, AI tools have virtually no oversight, regulation, or governance to guide their development. There are no mandatory ethics or health training programs, and the risk of misclassification is high while the benefit is marginal.
For these reasons and others, Annika believes AI should not be used as a way to predict suicide. While it is certainly a powerful tool, professionals are needed at every stage to ensure models are used responsibly.
“It's really important to remember that suicide is almost universally stigmatized,” she explained. “I'm not just talking about the countries where it's illegal, but actually just as a taboo in society. Here in the U.S. too. And of course, we want to prevent artificial intelligence from increasing any kind of stigma or harm, and so there needs to be a really careful consideration around how we deploy such technologies.”
Watch Annika Marie Schoene’s full talk here. You can also learn more about how the institute’s Responsible AI practice helps organizations navigate ethical challenges presented by AI technologies.