incomplete multi-omics

  • Article type: Journal Article
    The gut microbiome is regarded as one of the fundamental determinants of human health, and multi-omics data profiling is increasingly used to deepen the understanding of this complex system. However, because of cost or other constraints, multi-omics integration often suffers from incomplete views, which poses a major challenge for comprehensive analysis. In this work, a novel deep model named Incomplete Multi-Omics Variational Neural Networks (IMOVNN) is proposed for incomplete data integration, disease prediction, and biomarker identification. Benefiting from the information bottleneck and a marginal-to-joint distribution integration mechanism, IMOVNN can learn the marginal latent representation of each individual omics and the joint latent representation for better disease prediction. Moreover, owing to a feature-selective layer based on the concrete distribution, the model is interpretable and can identify the most relevant features. Experiments on inflammatory bowel disease multi-omics datasets demonstrate that our method outperforms several state-of-the-art methods for disease prediction. In addition, IMOVNN has identified significant biomarkers from multi-omics data sources.
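
The abstract does not spell out how the feature-selective layer works, so the following is only a minimal sketch of the general idea behind a concrete (Gumbel-softmax) relaxation: each selector draws a soft, nearly one-hot weight vector over the input features. The function name `concrete_select`, the logits, and the temperature value are assumptions made for illustration and are not taken from IMOVNN.

```python
import numpy as np

def concrete_select(logits, temperature, rng):
    """Draw relaxed one-hot selection weights, one row per selector.

    logits: (n_selectors, n_features) unnormalised scores (hypothetical here;
    in a trained model they would be learned parameters).
    """
    # Gumbel(0, 1) noise; the lower bound keeps log(u) finite.
    u = rng.uniform(low=np.finfo(float).tiny, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    scaled = (logits + gumbel) / temperature
    scaled -= scaled.max(axis=1, keepdims=True)  # stabilise the softmax
    w = np.exp(scaled)
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Illustrative logits: selector 0 strongly prefers feature 3, selector 1 feature 1.
logits = np.zeros((2, 5))
logits[0, 3] = 100.0
logits[1, 1] = 100.0

weights = concrete_select(logits, temperature=0.5, rng=rng)

# Applying the weights to data yields one (soft) selected feature per selector.
X = rng.normal(size=(4, 5))
selected = X @ weights.T  # shape (n_samples, n_selectors)
```

Lowering the temperature pushes each row of `weights` toward a hard one-hot choice, and the features with the largest weights can then be read off as candidate biomarkers.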

  • Article type: Journal Article
    The increased availability of high-throughput technologies has enabled biomedical researchers to study disease etiology across multiple omics layers, which shows promise for improving cancer subtype identification. Many computational methods have been developed to cluster multi-omics data; however, only a few are applicable to partial multi-omics, in which some samples lack data for some omics types. In this study, we propose a novel multi-omics clustering method based on latent subspace learning (MCLS), which can handle missing omics data when clustering. We use the samples with complete omics to construct a latent subspace using PCA-based feature extraction and singular value decomposition (SVD). The samples with incomplete multi-omics are then projected onto the latent subspace, and spectral clustering is performed to find the clusters. The proposed MCLS method is evaluated on seven different cancer datasets with three omics levels, in both full and partial cases, against several state-of-the-art methods. The experimental results show that MCLS is more efficient and effective than the compared methods for cancer subtype identification in multi-omics data analysis, providing an important reference for a comprehensive understanding of cancer and its biological mechanisms. AVAILABILITY: The proposed method is freely available at https://github.com/ShangCS/MCLS.
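
The three steps described above (build a subspace from complete samples, project incomplete samples, cluster) can be sketched roughly as follows. This is not the authors' implementation: the function names are hypothetical, the least-squares projection over observed features is an assumed way of handling a missing omics block, and a two-way split on the sign of the Fiedler vector stands in for full spectral clustering. The data are synthetic.

```python
import numpy as np

def fit_latent_subspace(X_complete, n_components):
    """PCA via SVD on complete samples; returns feature means and loadings."""
    mean = X_complete.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_complete - mean, full_matrices=False)
    return mean, Vt[:n_components].T  # loadings: (n_features, n_components)

def project(x, observed, mean, loadings):
    """Latent coordinates by least squares on the observed features only (assumption)."""
    z, *_ = np.linalg.lstsq(loadings[observed], x[observed] - mean[observed], rcond=None)
    return z

def spectral_bipartition(Z):
    """Two clusters from the sign of the second eigenvector of the normalised Laplacian."""
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    gamma = 1.0 / np.median(sq[np.triu_indices(len(Z), k=1)])  # median heuristic
    A = np.exp(-gamma * sq)  # RBF affinity
    np.fill_diagonal(A, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(Z)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)

# Synthetic data: 40 samples, two groups, 10 features (omics A: 0-5, omics B: 6-9).
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
Z_true = rng.normal(scale=0.5, size=(40, 2))
Z_true[:, 0] += np.where(truth == 0, -3.0, 3.0)
X = Z_true @ rng.normal(size=(2, 10)) + rng.normal(scale=0.1, size=(40, 10))

# Ten samples are treated as lacking omics B; their B columns are never read.
partial = np.zeros(40, dtype=bool)
partial[[0, 1, 2, 3, 4, 20, 21, 22, 23, 24]] = True
observed_all = np.ones(10, dtype=bool)
observed_partial = np.arange(10) < 6

mean, loadings = fit_latent_subspace(X[~partial], n_components=2)
Z = np.array([
    project(x, observed_partial if p else observed_all, mean, loadings)
    for x, p in zip(X, partial)
])
labels = spectral_bipartition(Z)
```

For more than two subtypes, the leading eigenvectors would normally be passed to k-means rather than thresholded; the repository linked in the abstract is the place to check how MCLS itself handles this.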