Keywords: ENCODE; GTEx; allele-specific activity; eQTLs; functional epigenomes; functional genomics; genome annotations; personal genome; predictive models; structural variants; tissue specificity; transformer model

MeSH: Epigenome; Quantitative Trait Loci; Genome-Wide Association Study; Genomics; Phenotype; Polymorphism, Single Nucleotide

Source: DOI:10.1016/j.cell.2023.02.018

Abstract:
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.