Keywords: SHAP value; Shapley additive explanations; XGBoost; adolescent; machine learning; mental health; predictive model; risk behavior; suicidal thinking

MeSH: Humans; Adolescent; Machine Learning; Female; Male; Suicidal Ideation; Republic of Korea; Algorithms; Cohort Studies; Adolescent Behavior / psychology; Suicide / statistics & numerical data / psychology; Norway; Surveys and Questionnaires; Risk Factors; Risk-Taking

Source: DOI:10.2196/55913

Abstract:
BACKGROUND: Suicide is the second-leading cause of death among adolescents and is associated with clusters of suicides. Despite numerous studies on this preventable cause of death, the focus has primarily been on single nations and traditional statistical methods.
OBJECTIVE: This study aims to develop a predictive model for adolescent suicidal thinking using multinational data sets and machine learning (ML).
METHODS: We used data from the Korea Youth Risk Behavior Web-based Survey, comprising 566,875 adolescents aged 13 to 18 years, and conducted external validation using the Youth Risk Behavior Survey (103,874 adolescents) and Norway's University National General Survey (19,574 adolescents). Several tree-based ML models were developed, and feature importance and Shapley additive explanations (SHAP) values were analyzed to identify risk factors for adolescent suicidal thinking.
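As an illustration of what Shapley additive explanations compute, the sketch below derives exact Shapley values for a small hand-written scoring function. It is not the study's model or pipeline: `toy_score`, its coefficients, and the example respondent are hypothetical, and the brute-force enumeration stands in for the tree-specific algorithms that SHAP tooling typically uses with models such as XGBoost.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are held at their baseline value.
    Cost grows as 2^n, so this only suits a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight shared by all coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [x[j] if j in coalition else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical toy risk score (NOT the study's model): sadness, stress, age.
def toy_score(features):
    sadness, stress, age = features
    return 0.5 * sadness + 0.2 * stress + 0.1 * sadness * stress + 0.01 * age


x = [1, 1, 16]         # hypothetical respondent: sad, stressed, aged 16
baseline = [0, 0, 15]  # hypothetical reference respondent
phi = shapley_values(toy_score, x, baseline)

# Additivity: the contributions sum to the change in the model output.
total = toy_score(x) - toy_score(baseline)

# One plausible way to express contributions as shares of total impact
# (an assumption; the abstract does not say how its percentages were derived).
shares = [abs(p) / sum(abs(q) for q in phi) for p in phi]
```

The interaction term is split evenly between sadness and stress, which is the property that makes Shapley values attractive for attributing a prediction to individual survey items.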
RESULTS: When trained on the Korea Youth Risk Behavior Web-based Survey data from South Korea, the XGBoost model achieved an area under the receiver operating characteristic curve (AUROC) of 90.06% (95% CI 89.97-90.16), outperforming the other models. In external validation on the Youth Risk Behavior Survey data from the United States and the University National General Survey data from Norway, the XGBoost model achieved AUROCs of 83.09% and 81.27%, respectively. Because XGBoost had the highest AUROC across all data sets, it was selected as the optimal model. Among predictors of suicidal thinking, feelings of sadness and despair were the most influential, accounting for 57.4% of the impact, followed by stress status (19.8%), age (5.7%), household income (4%), academic achievement (3.4%), and sex (2.1%); the remaining predictors contributed less than 2% each.
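For readers unfamiliar with how an AUROC and its 95% CI can be obtained, the following is a minimal sketch using only the Python standard library. The abstract does not state how its interval was computed; the percentile bootstrap used here is an assumption, and the labels and scores are made-up toy values rather than study data.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random
    negative (ties count one half), computed from midranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both classes")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC, resampling respondents."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[round((alpha / 2) * n_boot)]
    hi = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Toy data (hypothetical): 1 = reported suicidal thinking, 0 = did not.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.2, 0.6]

point = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUROC {point:.2%} (95% CI {lo:.2%}-{hi:.2%})")
```

With only eight toy observations the interval is very wide; the narrow interval reported above is consistent with the large size of the Korean sample.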
CONCLUSIONS: This study used ML by integrating diverse data sets from 3 countries to address adolescent suicide. The findings highlight the important role of emotional health indicators in predicting suicidal thinking among adolescents. Specifically, sadness and despair were identified as the most significant predictors, followed by stressful conditions and age. These findings emphasize the critical need for early diagnosis and prevention of mental health issues during adolescence.