Keywords: disease signal; gene distance; genes of interest; network diffusion; network embedding; omics data analysis

MeSH: Proteomics / methods; Software; Genomics / methods; Computational Biology / methods; Protein Interaction Maps

Source: DOI:10.1093/bib/bbae111

Abstract:
High-throughput genomic and proteomic approaches allow investigators to quantify genes (or gene products) genome-wide under specific disease conditions, which plays an essential role in the discovery of disease mechanisms. These approaches often generate a large list of genes of interest (GOIs), such as differentially expressed genes or proteins. However, researchers must then perform manual triage and validation to find the most promising, biologically plausible links between known disease genes and GOIs (disease signals) for further study. To address this challenge, we propose DDK-Linker, a network-based strategy that facilitates the exploration of disease signals hidden in omics data by linking GOIs to known disease genes. Specifically, it reconstructs gene distances in the protein-protein interaction (PPI) network using six network methods (random walk with restart, DeepWalk, Node2Vec, LINE, HOPE and Laplacian) and identifies as disease signals those GOIs that lie at shorter distances from disease genes. Furthermore, drawing on a knowledge base we established, it provides abundant bioinformatics annotations for each candidate disease signal. To assist omics data interpretation and ease of use, we have implemented this strategy as an application that users can access through a website or download as an R package. We believe DDK-Linker will accelerate the exploration of disease genes and drug targets in a variety of omics data, such as genomics, transcriptomics and proteomics data, and provide clues for research on complex disease mechanisms and pharmacology. DDK-Linker is freely accessible at http://ddklinker.ncpsb.org.cn/.
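To make the distance-based ranking idea concrete, the following is a minimal sketch of random walk with restart, the first of the six methods named in the abstract. It is not DDK-Linker's implementation and does not reflect the API of its R package: the function name, the five-gene toy network, the gene labels and the restart probability of 0.5 are all assumptions chosen for illustration. The sketch scores every gene by its steady-state visiting probability when the walk restarts at a known disease gene, then ranks the GOIs so that genes closer to the disease gene come first.

```python
import numpy as np


def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walk that restarts at the seeds.

    Hypothetical helper for illustration; `restart` is an assumed value.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # guard against isolated nodes
    W = adj / col_sums                 # column-normalised transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)  # uniform restart distribution over seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Toy PPI network (assumed): a chain D1 - G1 - G2 - G3 - G4,
# where D1 is the known disease gene and G1..G4 are the GOIs.
genes = ["D1", "G1", "G2", "G3", "G4"]
adj = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])

scores = random_walk_with_restart(adj, seeds=[0])

# Rank the GOIs by descending score: a higher score means the gene is
# "closer" to the disease gene in the diffusion sense.
goi_indices = [1, 2, 3, 4]
ranked = [genes[i] for i in sorted(goi_indices, key=lambda i: -scores[i])]
print(ranked)
```

In this toy chain the ranking simply follows path length from D1. On a real PPI network the diffusion score also reflects how many paths connect a GOI to the disease genes, which is one reason such methods are used in place of plain shortest-path distance.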