Keywords: BACkPAy; Bayesian model; DNA methylation; LIMMA; gastric cancer

Source: DOI:10.3389/fgene.2021.705708

Abstract:
DNA methylation in critical regions is closely involved in cancer pathogenesis and drug response. However, identifying causal methylation events among a large number of potentially polymorphic DNA methylation sites is challenging. Such high-dimensional data pose two obstacles: first, many established statistical models do not scale to so many features; second, multiple testing and overfitting become serious problems. A method that quickly filters candidate sites and narrows down targets for downstream analyses is therefore urgently needed. BACkPAy is a Bayesian pre-screening approach that detects biologically meaningful patterns of potential differential methylation levels when the sample size is small. BACkPAy prioritizes potentially important biomarkers using a Bayesian false discovery rate (FDR) approach, and it filters out non-informative (i.e., non-differential) sites whose methylation levels are flat across experimental conditions. In this work, we applied BACkPAy to a genome-wide methylation dataset comprising three tissue types, each with three gastric cancer samples. We also applied LIMMA (Linear Models for Microarray and RNA-Seq Data) and compared its results with those obtained by BACkPAy. Cox proportional hazards regression models were then used with The Cancer Genome Atlas (TCGA) data to visualize prognostically significant markers in a survival analysis. Using BACkPAy, we identified eight biologically meaningful patterns/groups of differential probes in the DNA methylation dataset. Using TCGA data, we also identified five prognostic genes (i.e., genes predictive of gastric cancer progression) that contain some differential methylation probes, whereas no significant results were identified using the Benjamini-Hochberg FDR in LIMMA. These results show the value of BACkPAy for analyzing DNA methylation data with extremely small sample sizes in gastric cancer. We found that RDH13, CLDN11, TMTC1, UCHL1, and FOXP2 can serve as predictive biomarkers for gastric cancer treatment, and that the promoter methylation levels of these five genes in serum could have prognostic and diagnostic value in gastric cancer patients.
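To make the idea of prioritizing sites by a Bayesian FDR more concrete, the sketch below shows one common generic rule: rank sites by their posterior probability of being non-differential and keep the largest set whose average null probability stays at or below a target level. This is an illustration only, not BACkPAy's actual implementation; the function name `bayesian_fdr_select`, the input `null_probs`, and the example values are all hypothetical, and the posterior probabilities are assumed to come from some already fitted model.

```python
def bayesian_fdr_select(null_probs, alpha=0.05):
    """Pick sites so that the estimated Bayesian FDR is at most alpha.

    null_probs[i] is assumed to be the posterior probability that site i
    is non-differential (hypothetical input from a fitted model).
    The estimated FDR of a selected set is the mean of its null probabilities.
    """
    # Rank sites from most to least likely to be differential.
    order = sorted(range(len(null_probs)), key=lambda i: null_probs[i])

    running_sum = 0.0
    best_k = 0
    for k, i in enumerate(order, start=1):
        running_sum += null_probs[i]
        # Mean null probability of the top-k sites = estimated FDR.
        if running_sum / k <= alpha:
            best_k = k

    # Return the selected site indices in their original order.
    return sorted(order[:best_k])


# Made-up posterior null probabilities for five sites.
example_probs = [0.01, 0.90, 0.02, 0.50, 0.03]
selected_sites = bayesian_fdr_select(example_probs, alpha=0.05)
print(selected_sites)
```

Sites with a high null probability (flat methylation across conditions) are the ones left out, which corresponds to the filtering role described in the abstract.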