Numerous genome-wide association studies (GWAS) conducted to date have revealed genetic variants associated with various diseases, including breast and prostate cancers. Despite the availability of these large-scale data, relatively few variants have been functionally characterized, mainly because the majority of single-nucleotide polymorphisms (SNPs) map to non-coding regions of the human genome. The functional characterization of these non-coding variants and the identification of their target genes remain challenging.
In this communication, we explore the potential functional mechanisms of non-coding SNPs by integrating GWAS data with high-resolution chromosome conformation capture (Hi-C) data for breast and prostate cancers. We show that more genetic variants map to regulatory elements when the 3D genome structure is taken into account than when only the 1D linear genome, which lacks physical chromatin interaction information, is used. Importantly, the association of enhancers, transcription factors, and their target genes with breast and prostate cancers tends to be stronger when these regulatory elements are mapped to high-risk SNPs through spatial interactions than when they are mapped by linear proximity alone. Finally, we demonstrate that topologically associating domains (TADs) carrying high-risk SNPs also contain gene regulatory elements whose association with cancer is generally stronger than that of elements in control TADs containing no high-risk variants.
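The contrast between linear and spatial mapping described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study. It assumes a single chromosome, a fixed Hi-C bin size, and contacts that have already been called as pairs of bin indices. All names (`map_linear`, `map_spatial`, `BIN_SIZE`, the toy identifiers) are hypothetical.

```python
from collections import defaultdict

BIN_SIZE = 10_000  # assumed Hi-C resolution in base pairs (hypothetical)


def to_bin(pos, bin_size=BIN_SIZE):
    """Convert a genomic coordinate to a Hi-C bin index."""
    return pos // bin_size


def map_linear(snps, elements, window):
    """Link each SNP to elements lying within `window` bp along the 1D genome.

    snps: {snp_id: position}; elements: {element_id: (start, end)}.
    """
    hits = defaultdict(set)
    for snp_id, pos in snps.items():
        for elem_id, (start, end) in elements.items():
            if start - window <= pos <= end + window:
                hits[snp_id].add(elem_id)
    return dict(hits)


def map_spatial(snps, elements, contacts, bin_size=BIN_SIZE):
    """Link each SNP to elements whose bin is in Hi-C contact with the SNP's bin.

    contacts: iterable of (bin_a, bin_b) pairs, treated as symmetric.
    """
    # Build a symmetric lookup from each bin to its contact partners.
    neighbors = defaultdict(set)
    for a, b in contacts:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Index every element under each bin it overlaps.
    elems_by_bin = defaultdict(set)
    for elem_id, (start, end) in elements.items():
        for b in range(to_bin(start, bin_size), to_bin(end, bin_size) + 1):
            elems_by_bin[b].add(elem_id)

    hits = defaultdict(set)
    for snp_id, pos in snps.items():
        for b in neighbors[to_bin(pos, bin_size)]:
            hits[snp_id] |= elems_by_bin[b]
    # Drop SNPs whose contact partners contain no elements.
    return {snp_id: elems for snp_id, elems in hits.items() if elems}


# Toy data, invented purely for illustration.
toy_snps = {"rs_toy1": 105_000, "rs_toy2": 905_000}
toy_elements = {"enh_A": (100_000, 102_000), "enh_B": (500_000, 503_000)}
toy_contacts = [(10, 50), (90, 50)]

linear_hits = map_linear(toy_snps, toy_elements, window=5_000)
spatial_hits = map_spatial(toy_snps, toy_elements, toy_contacts)
```

In this toy setup, `rs_toy2` has no element nearby on the linear genome but reaches `enh_B` through a contact between distant bins. This is the kind of additional SNP-to-element link that spatial mapping can recover.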
Our results suggest that many SNPs may contribute to cancer development by affecting the expression of certain tumor-related genes through long-range chromatin interactions with gene regulatory elements. Integrating large-scale genetic datasets with the 3D genome structure offers an attractive and unique approach to systematically investigate the functional mechanisms of genetic variants in disease risk and progression.