Patterns of transcriptional activity are encoded in our genome through regulatory elements such as promoters or enhancers that, paradoxically, contain similar assortments of sequence-specific transcription factor (TF) binding sites [1-3]. Knowledge of how these sequence motifs encode multiple, often overlapping, gene expression programs is central to understanding gene regulation and how mutations in non-coding DNA manifest in disease [4,5]. Here, by studying gene regulation from the perspective of individual transcription start sites (TSSs), using natural genetic variation, perturbation of endogenous TF protein levels and massively parallel analysis of natural and synthetic regulatory elements, we show that the effect of TF binding on transcription initiation is position dependent. Analysing TF-binding-site occurrences relative to the TSS, we identified several motifs with highly preferential positioning. We show that these patterns are a combination of a TF's distinct functional profiles: many TFs, including canonical activators such as NRF1, NFY and Sp1, activate or repress transcription initiation depending on their precise position relative to the TSS. As such, TFs and their spacing collectively guide the site and frequency of transcription initiation. More broadly, these findings reveal how similar assortments of TF binding sites can generate distinct gene regulatory outcomes depending on their spatial configuration and how DNA sequence polymorphisms may contribute to transcription variation and disease, and they underscore a critical role for TSS data in decoding the regulatory information of our genome.