Keywords: Behavioural phenotypes; Genetic syndromes; Intellectual disability; Machine learning

MeSH: Male; Humans; Adolescent; Child; Female; Intellectual Disability; Autism Spectrum Disorder; Cohort Studies; Cross-Sectional Studies; Genomics; Machine Learning

Source: DOI:10.1186/s13229-023-00549-2

Abstract:
Background: Genomic conditions can be associated with developmental delay, intellectual disability, autism spectrum disorder, and physical and mental health symptoms. They are individually rare and highly variable in presentation, which limits the use of standard clinical guidelines for diagnosis and treatment. A simple screening tool to identify young people with genomic conditions associated with neurodevelopmental disorders (ND-GCs) who could benefit from further support would be of considerable value. We used machine learning approaches to address this question.
Methods: A total of 493 individuals were included: 389 with a ND-GC (mean age = 9.01, 66% male) and 104 siblings without known genomic conditions (controls; mean age = 10.23, 53% male). Primary carers completed assessments of behavioural, neurodevelopmental and psychiatric symptoms and of physical health and development. Machine learning techniques (penalised logistic regression, random forests, support vector machines and artificial neural networks) were used to develop classifiers of ND-GC status and to identify limited sets of variables that gave the best classification performance. Exploratory graph analysis was used to understand associations within the final variable set.
Results: All machine learning methods identified variable sets giving high classification accuracy (AUROC between 0.883 and 0.915). We identified a subset of 30 variables that best discriminated between individuals with ND-GCs and controls; these variables formed 5 dimensions: conduct, separation anxiety, situational anxiety, communication and motor development.
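To make the AUROC figures above concrete, the following is a minimal sketch of how the statistic can be computed from classifier scores. It uses the standard pairwise (Mann-Whitney) interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen control. The function name `auroc` and the example scores are hypothetical illustrations, not data or code from the study.

```python
def auroc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case
    has the higher score; tied pairs count as half a win.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier outputs (NOT values from the study).
example_cases = [0.9, 0.8, 0.7, 0.4]
example_controls = [0.6, 0.3, 0.2]

print(auroc(example_cases, example_controls))  # 11 of 12 pairs ranked correctly
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values between 0.883 and 0.915 indicate that cases were ranked above controls in roughly nine out of ten pairs.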
Limitations: This study used cross-sectional data from a cohort study that was imbalanced with respect to ND-GC status. Our model requires validation in independent datasets and with longitudinal follow-up data before clinical application.
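The class imbalance noted above can be quantified from the reported group sizes. The sketch below is an illustration only: it shows why raw accuracy is a weak yardstick on such data, using the counts from the abstract. The helper `balanced_accuracy` is a hypothetical name introduced here, and the study does not state that it used this metric.

```python
# Group sizes as reported in the abstract.
n_cases = 389      # individuals with a ND-GC
n_controls = 104   # sibling controls
total = n_cases + n_controls

# A trivial rule that labels everyone as a case is correct for every case.
majority_baseline = max(n_cases, n_controls) / total
print(f"Majority-class accuracy: {majority_baseline:.3f}")


def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity (hypothetical helper)."""
    return (sensitivity + specificity) / 2


# The same trivial rule has sensitivity 1.0 and specificity 0.0.
trivial_balanced = balanced_accuracy(1.0, 0.0)
print(f"Balanced accuracy of the trivial rule: {trivial_balanced:.3f}")
```

Because a classifier that ignores its inputs already reaches about 79% raw accuracy here, threshold-independent or class-balanced measures are the more informative way to compare models on this kind of sample.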
Conclusions: In this study, we developed models that identified a compact set of psychiatric and physical health measures that differentiate individuals with a ND-GC from controls, and that highlighted higher-order structure within these measures. This work is a step towards developing a screening instrument to identify young people with ND-GCs who might benefit from further specialist assessment.