Keywords: UK Biobank; network diffusion; random walk with restart (RWR); rare variants; whole exome sequencing (WES)

Source: DOI:10.1002/gepi.22557

Abstract:
Genome-wide association studies (GWAS) have provided an abundance of information about the genetic variants and loci associated with complex traits and diseases. However, because of linkage disequilibrium (LD) and because many associated loci lie in noncoding regions, pinpointing the causal genes remains a challenge. Gene network-based approaches, paired with network diffusion methods, have been proposed to prioritize causal genes and to boost statistical power in GWAS, on the assumption that trait-associated genes cluster in a gene network. Because trait-associated variants are difficult to map to genes in GWAS, this assumption has never been directly or rigorously tested empirically. Whole exome sequencing (WES) data, by contrast, focus on protein-coding regions and so identify trait-associated genes directly. In this study, we tested the assumption by leveraging the recently available exome-based association statistics from the UK Biobank WES data along with two types of networks. We found that almost all trait-associated genes were significantly more proximal to each other than randomly selected genes within both networks. These results support the assumption that trait-associated genes are clustered in gene networks, which can be further leveraged to boost the power of GWAS, for example by introducing less stringent p value thresholds.
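The clustering test described in the abstract can be illustrated with a minimal sketch. It is not the study's actual pipeline: the toy two-clique network, the restart probability of 0.5, the mean pairwise RWR score used as the proximity measure, and the uniform sampling of random gene sets (which ignores node degree) are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np


def rwr(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart: steady-state visiting probabilities from the seeds."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # avoid dividing by zero for isolated nodes
    W = adj / col_sums                 # column-normalized transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)  # walker restarts uniformly at the seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * W @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def mean_pairwise_proximity(adj, genes, restart=0.5):
    """Average RWR score over all ordered pairs of distinct genes in the set."""
    genes = list(genes)
    total = 0.0
    for g in genes:
        p = rwr(adj, [g], restart)
        total += sum(p[h] for h in genes if h != g)
    return total / (len(genes) * (len(genes) - 1))


def permutation_p_value(adj, genes, n_perm=200, restart=0.5, seed=0):
    """Compare the observed proximity with that of uniformly sampled gene sets."""
    rng = np.random.default_rng(seed)
    n = np.asarray(adj).shape[0]
    observed = mean_pairwise_proximity(adj, genes, restart)
    null = [
        mean_pairwise_proximity(
            adj, rng.choice(n, size=len(genes), replace=False), restart
        )
        for _ in range(n_perm)
    ]
    # Add-one correction keeps the empirical p value strictly positive.
    p_value = (1 + sum(x >= observed for x in null)) / (n_perm + 1)
    return observed, p_value


def two_clique_network(k=5):
    """Toy network (assumption): two k-node cliques joined by a single edge."""
    adj = np.zeros((2 * k, 2 * k))
    adj[:k, :k] = 1
    adj[k:, k:] = 1
    np.fill_diagonal(adj, 0)
    adj[k - 1, k] = adj[k, k - 1] = 1
    return adj


adj = two_clique_network()
observed, p_value = permutation_p_value(adj, [0, 1, 2, 3])
print(f"observed proximity = {observed:.4f}, empirical p = {p_value:.4f}")
```

Here the hypothetical "trait-associated" set sits inside one clique, so its mean pairwise proximity should exceed that of a set split across both cliques. A real analysis would likely need a null model that matches node degree, since well-connected genes are proximal to many others regardless of trait.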