Keywords: Cancer; Driver genes; Hierarchical weak consensus model; Hypergraph; Multi-omics data

Source: DOI:10.1007/s13755-024-00279-6

Abstract:
Cancer is a complex disease driven by the accumulation of gene mutations during somatic cell evolution. High-throughput technologies have generated large volumes of omics data, and identifying cancer-related driver genes from these data remains a challenge. Early frequency-based methods for driver gene identification perform poorly on driver genes with low mutation rates. Later network-based methods fuse multi-omics data but rarely consider the connections among features. After analyzing a large number of methods for integrating multi-omics data, this paper proposes a hierarchical weak consensus model that fuses multiple features according to the connections among them. By analyzing the connection between the PPI network and the co-mutation hypergraph network, it first proposes a new topological feature called the co-mutation clustering coefficient (CMCC). The hierarchical weak consensus model then integrates CMCC with mRNA and miRNA differential expression scores, yielding a new driver gene identification method, HWC. HWC is compared with seven current state-of-the-art methods on three types of cancer. The results show that HWC achieves the best identification performance in terms of statistical evaluation indices, functional consistency, and the partial area under the ROC curve.
Supplementary information: The online version contains supplementary material available at 10.1007/s13755-024-00279-6.