Keywords: Angelman syndrome; Beckwith–Wiedemann syndrome; Congenital disease; Diagnosis; Machine learning; Methylation; Prader–Willi syndrome; Russell–Silver syndrome; Silver–Russell syndrome

MeSH: Humans; Genomic Imprinting; DNA Methylation; Beckwith-Wiedemann Syndrome / diagnosis, genetics; Silver-Russell Syndrome / diagnosis, genetics; Supervised Machine Learning

Source: DOI:10.1186/s12859-024-05673-1

Abstract:
BACKGROUND: DNA methylation is one of the most stable and well-characterized epigenetic alterations in humans. Accordingly, it has already found clinical utility as a molecular biomarker in a variety of disease contexts. Existing methods for clinical diagnosis of methylation-related disorders focus on outlier detection in a small number of CpG sites using standardized cutoffs which differentiate healthy from abnormal methylation levels. The standardized cutoff values used in these methods do not take into account methylation patterns which are known to differ between the sexes and with age.
RESULTS: Here we profile genome-wide DNA methylation in blood samples from a cohort of healthy controls of different ages and sexes, alongside patients with Prader-Willi syndrome (PWS), Beckwith-Wiedemann syndrome, Fragile-X syndrome, Angelman syndrome, and Silver-Russell syndrome. We propose a Generalized Additive Model to perform age- and sex-adjusted outlier analysis of around 700,000 CpG sites throughout the human genome. Using per-site z-scores computed across the cohort, we deployed an ensemble-based machine learning pipeline and achieved a combined prediction accuracy of 0.96 (binomial 95% confidence interval 0.868 to 0.995).
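As an illustration of how a binomial confidence interval on a classification accuracy can be obtained, the following is a minimal sketch. The abstract does not state which interval was used or how many samples were in the test set, so the exact (Clopper-Pearson) interval and the counts in the usage line are assumptions made here for illustration only.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials.

    Assumption: the Clopper-Pearson construction; the source only says
    "Binomial 95% Confidence Interval".
    """
    def solve(cdf_k, target):
        # binom_cdf is decreasing in p, so bisect for cdf(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_cdf(cdf_k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(k - 1, 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(k, alpha / 2)
    return lower, upper

# Hypothetical counts (not from the paper): 48 correct out of 50.
low, high = clopper_pearson(48, 50)
print(f"accuracy={48 / 50:.2f}, 95% CI=({low:.3f}, {high:.3f})")
```

The interval is asymmetric around the point estimate, which is expected when the accuracy is close to 1.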
CONCLUSIONS: We demonstrate a method for age- and sex-adjusted outlier detection of differentially methylated loci based on a large cohort of healthy individuals. We present a custom machine learning pipeline that uses this outlier analysis to classify samples for potential methylation-associated congenital disorders. Combined with machine learning classifiers, these methods achieve high accuracy in classifying abnormal methylation patterns.
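To make the idea of age- and sex-adjusted outlier detection concrete, here is a minimal sketch for a single CpG site. It is not the authors' model: in place of a Generalized Additive Model it substitutes an ordinary least-squares fit with a polynomial term in age and an additive sex indicator, and all function names, the 0/1 sex coding, and the synthetic data are hypothetical.

```python
import numpy as np

def design_matrix(age, sex, degree):
    """Columns: intercept, age, ..., age**degree, sex (coded 0/1)."""
    age = np.atleast_1d(np.asarray(age, dtype=float))
    sex = np.atleast_1d(np.asarray(sex, dtype=float))
    cols = [np.ones_like(age)]
    cols += [age**d for d in range(1, degree + 1)]
    cols.append(sex)
    return np.column_stack(cols)

def fit_reference(beta, age, sex, degree=2):
    """Fit expected methylation for one site using healthy controls only.

    Stand-in for a GAM: polynomial in age plus an additive sex effect.
    Returns the coefficients and the residual standard deviation.
    """
    beta = np.asarray(beta, dtype=float)
    X = design_matrix(age, sex, degree)
    coef, *_ = np.linalg.lstsq(X, beta, rcond=None)
    resid = beta - X @ coef
    sd = resid.std(ddof=X.shape[1])  # account for fitted parameters
    return coef, sd

def adjusted_z(beta, age, sex, coef, sd):
    """z-score of observed methylation relative to the age/sex expectation."""
    degree = len(coef) - 2  # intercept and sex columns are not age terms
    X = design_matrix(age, sex, degree)
    return (np.atleast_1d(np.asarray(beta, dtype=float)) - X @ coef) / sd

# Synthetic controls (illustrative values only, not real data).
rng = np.random.default_rng(0)
ages = rng.uniform(0, 80, size=200)
sexes = rng.integers(0, 2, size=200)
betas = 0.5 + 0.002 * ages + 0.05 * sexes + rng.normal(0, 0.02, size=200)

coef, sd = fit_reference(betas, ages, sexes)
print(adjusted_z(0.93, 40, 1, coef, sd))  # large positive z: flagged as outlier
```

Fitting the reference on healthy controls only keeps patient samples from distorting the expected value; a fixed cutoff, by contrast, would treat the same methylation level identically regardless of age or sex.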