Keywords: GO enrichment; Gene co-expression networks; Gene modules; WGCNA

MeSH terms: Algorithms; Cluster Analysis; Gene Regulatory Networks / genetics; Gene Expression Profiling / methods; Computational Biology / methods; Humans; Gene Ontology; Multigene Family; Databases, Genetic

Source: DOI:10.1186/s12859-024-05848-w

Abstract:
BACKGROUND: A widely used approach for extracting information from gene expression data employs the construction of a gene co-expression network and the subsequent computational detection of gene clusters, called modules. WGCNA and related methods are the de facto standard for module detection. The purpose of this work is to investigate the applicability of more sophisticated algorithms toward the design of an alternative method with enhanced potential for extracting biologically meaningful modules.
RESULTS: We present the self-learning gene clustering pipeline (SGCP), a spectral method for detecting modules in gene co-expression networks. SGCP incorporates multiple features that differentiate it from previous work, including a novel self-learning step that leverages gene ontology (GO) information. In a comparison with widely used existing frameworks on 12 real gene expression datasets, we show that SGCP yields modules with higher GO enrichment. Moreover, the GO terms to which SGCP assigns the highest statistical importance are mostly different from those reported by the baselines.
CONCLUSIONS: Existing frameworks for discovering clusters of genes in gene co-expression networks are based on relatively simple algorithmic components. SGCP relies on newer algorithmic techniques that enable the computation of highly enriched modules with distinctive characteristics, thus contributing a novel alternative tool for gene co-expression analysis.
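
As a rough illustration of what "a spectral method for detecting modules in gene co-expression networks" can mean in general, the sketch below builds a co-expression network from absolute Pearson correlations and splits it into two modules using the sign of the second eigenvector of the normalized graph Laplacian. This is a generic textbook technique, not the SGCP implementation: the function names, the choice of adjacency, and the two-module split are all assumptions made for this example, and the GO-based self-learning step is not represented.

```python
import numpy as np


def coexpression_adjacency(expr):
    """Hypothetical helper: expr is a genes x samples matrix.

    Returns the absolute Pearson correlation between genes,
    with the diagonal set to zero (no self-loops).
    """
    adj = np.abs(np.corrcoef(expr))
    np.fill_diagonal(adj, 0.0)
    return adj


def spectral_bipartition(adj):
    """Hypothetical helper: split nodes into two modules.

    Uses the sign of the eigenvector belonging to the second-smallest
    eigenvalue of the symmetric normalized Laplacian
    L = I - D^{-1/2} A D^{-1/2}. Assumes every node has nonzero degree.
    """
    degree = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs[:, 1] > 0).astype(int)


# Toy data (assumed, not from the paper): two groups of 10 genes,
# each group following its own latent signal plus noise.
rng = np.random.default_rng(0)
signal_a = rng.normal(size=50)
signal_b = rng.normal(size=50)
expr = np.vstack([
    signal_a + 0.3 * rng.normal(size=(10, 50)),
    signal_b + 0.3 * rng.normal(size=(10, 50)),
])

labels = spectral_bipartition(coexpression_adjacency(expr))
print(labels)
```

Which group receives label 0 and which receives label 1 is arbitrary, because the sign of an eigenvector is not determined. A practical pipeline would also need to choose the number of modules and handle more than two clusters, which this sketch does not attempt.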