Conserved sequence

  • Article type: Journal Article
    A long-standing question concerns the role of Z-DNA in transcription. Here we use a deep learning approach, DeepZ, that predicts Z-flipons based on DNA sequence, structural properties of nucleotides, and omics data. We examined Z-flipons that are conserved between human and mouse genomes after generating whole-genome Z-flipon maps and then validated them by orthogonal approaches based on high-resolution chemical mapping of Z-DNA and the transformer algorithm Z-DNABERT. For human and mouse, we revealed similar patterns of transcription factors, chromatin remodelers, and histone marks associated with conserved Z-flipons. We found significant enrichment of Z-flipons in alternative and bidirectional promoters associated with neurogenesis genes. We show that conserved Z-flipons are associated with increased experimentally determined transcription reinitiation rates compared to promoters without Z-flipons, but without affecting elongation or pausing. Our findings support a model where Z-flipons engage Transcription Factor E and impact phenotype by enabling the reset of preinitiation complexes when active, and the suppression of gene expression when engaged by repressive chromatin complexes.

  • Article type: Journal Article
    Gene regulatory elements drive complex biological phenomena and their mutations are associated with common human diseases. The impacts of human regulatory variants are often tested using model organisms such as mice. However, mapping human enhancers to conserved elements in mice remains a challenge, due to both rapid enhancer evolution and limitations of current computational methods. We analyze distal enhancers across 45 matched human/mouse cell/tissue pairs from a comprehensive dataset of DNase-seq experiments, and show that while cell-specific regulatory vocabulary is conserved, enhancers evolve more rapidly than promoters and CTCF binding sites. Enhancer conservation rates vary across cell types, in part explainable by tissue-specific transposable element activity. We present an improved genome alignment algorithm using gapped-kmer features, called gkm-align, and make genome-wide predictions for 1,401,803 orthologous regulatory elements. We show that gkm-align discovers 23,660 novel human/mouse conserved enhancers missed by previous algorithms, with strong evidence of conserved functional activity.
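As a rough illustration of what "gapped-kmer features" can look like, the sketch below counts gapped k-mers in two short sequences and compares the resulting count vectors. It is not the gkm-align algorithm: the function names, the window length `l=4`, the number of informative positions `k=3`, and the use of cosine similarity are all assumptions chosen for the example.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def gapped_kmer_counts(seq, l=4, k=3):
    """Count gapped k-mers: each length-l window with l-k positions masked by '-'.

    l and k are illustrative defaults, not values taken from the abstract.
    """
    seq = seq.upper()
    counts = Counter()
    for start in range(len(seq) - l + 1):
        window = seq[start:start + l]
        # Skip windows containing ambiguous bases such as N.
        if any(base not in "ACGT" for base in window):
            continue
        # Keep every choice of k informative positions, mask the rest.
        for kept in combinations(range(l), k):
            gapped = "".join(window[i] if i in kept else "-" for i in range(l))
            counts[gapped] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two gapped k-mer count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    x = gapped_kmer_counts("ACGT")
    y = gapped_kmer_counts("ACGA")
    print(sorted(x))
    print(cosine_similarity(x, y))
```

The two toy sequences differ at one base, so they share no contiguous 4-mer, yet they still share the gapped feature `ACG-`. That tolerance to single mismatches is the general motivation for gapped features when comparing fast-evolving regulatory sequence.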

  • Article type: Journal Article
    Kaposi's sarcoma herpesvirus (KSHV) ORF34 plays a significant role as a component of the viral pre-initiation complex (vPIC), which is indispensable for late gene expression across beta- and gammaherpesviruses. Although the key role of ORF34 within the vPIC and its function as a hub protein have been recognized, further clarification regarding its specific contribution to vPIC functionality and interactions with other components is required. This study employed a deep learning algorithm-assisted structural model of ORF34, revealing highly conserved amino acid residues across human beta- and gammaherpesviruses localized in structured domains. Thus, we engineered ORF34 alanine-scanning mutants by substituting conserved residues with alanine. These mutants were evaluated for their ability to interact with other vPIC factors and restore viral production in cells harboring the ORF34-deficient KSHV-BAC. Our experimental results highlight the crucial role of the four cysteine residues conserved in ORF34: a tetrahedral arrangement consisting of a pair of C-Xn-C consensus motifs. This suggests the potential incorporation of metal cations in interacting with ORF24 and ORF66 vPIC components, facilitating late gene transcription, and promoting overall virus production by capturing metal cations. In summary, our findings underline the essential role of conserved cysteines in KSHV ORF34 for effective vPIC assembly and viral replication, thereby enhancing our understanding of the complex interplay between the vPIC components.
    OBJECTIVE: The initiation of late gene transcription is universally conserved across the beta- and gammaherpesvirus families. This process employs a viral pre-initiation complex (vPIC), which is analogous to a cellular PIC. Although KSHV ORF34 is a critical factor for viral replication and is a component of the vPIC, the specifics of vPIC formation and the essential domains crucial for its function remain unclear. Structural predictions suggest that the four conserved cysteines (C170, C175, C256, and C259) form a tetrahedron that coordinates the metal cation. We investigated the role of these conserved amino acids in interactions with other vPIC components, late gene expression, and virus production to demonstrate for the first time that these cysteines are pivotal for such functions. This discovery not only deepens our comprehensive understanding of ORF34 and vPIC dynamics but also lays the groundwork for more detailed studies on herpesvirus replication mechanisms in future research.
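The C-Xn-C pattern mentioned above (two cysteines separated by n arbitrary residues) is easy to scan for in a protein sequence. The sketch below is a minimal illustration; the function name, the range of n, and the toy sequence are hypothetical and are not the ORF34 sequence. From the residue numbers in the abstract, C170/C175 would correspond to n = 4 and C256/C259 to n = 2.

```python
def find_cxnc(seq, n_min=2, n_max=4):
    """Return (first_cys, second_cys, n) tuples with 1-based positions.

    A hit is any pair of cysteines separated by n residues, n_min <= n <= n_max.
    The n range is an assumption chosen to cover the spacings implied by the
    residue numbers quoted in the abstract (n = 4 and n = 2).
    """
    seq = seq.upper()
    hits = []
    for i, residue in enumerate(seq):
        if residue != "C":
            continue
        for n in range(n_min, n_max + 1):
            j = i + n + 1  # index of the candidate partner cysteine
            if j < len(seq) and seq[j] == "C":
                hits.append((i + 1, j + 1, n))
    return hits


if __name__ == "__main__":
    toy = "GGCAAAACGGGGGGCAACGG"  # hypothetical sequence, not ORF34
    for first, second, n in find_cxnc(toy):
        print(f"C{first}-X{n}-C{second}")
```

A scan like this only flags candidate motifs. Whether four cysteines actually coordinate a metal cation depends on the three-dimensional arrangement, which is why the study relies on a structural model and mutagenesis.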

  • Article type: Journal Article
    Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has caused a global public health crisis. As an RNA virus, the high gene mutability of SARS-CoV-2 poses significant challenges to the development of broad-spectrum vaccines and antiviral therapeutics. There remains a lack of specific therapeutics directly targeting SARS-CoV-2. With the ability to efficiently inhibit the expression of target genes in a sequence-specific way, small interfering RNA (siRNA) therapy has exhibited significant potential in antiviral and other disease treatments. In this work, we present a highly effective self-assembled siRNA nanoparticle targeting multiple highly conserved regions of SARS-CoV-2. The siRNA sequences targeting viral conserved regions were first screened and evaluated by their thermodynamic features, off-target effects, and secondary structure toxicities. RNA motifs including siRNA sequences were then designed and self-assembled into siRNA nanoparticles. These siRNA nanoparticles demonstrated remarkable uniformity and stability and efficiently entered cells directly through cellular endocytic pathways. Moreover, these nanoparticles effectively inhibited the replication of SARS-CoV-2, exhibiting a superior inhibitory effect compared to free siRNA. These results demonstrate that self-assembled siRNA nanoparticles targeting highly conserved regions of SARS-CoV-2 represent highly effective antiviral candidates for the treatment of infections, and show promise against current and future viral variants.

  • Article type: Journal Article
    Puumala orthohantavirus (PUUV) is an emerging zoonotic virus endemic to Europe and Russia that causes nephropathia epidemica, a mild form of hemorrhagic fever with renal syndrome (HFRS). There are limited options for treatment and diagnosis of orthohantavirus infection, making the search for potential immunogenic candidates crucial. In the present work, various bioinformatics tools were employed to design conserved immunogenic peptides containing multiple epitopes of PUUV nucleocapsid protein. Eleven conserved peptides (90% conservancy) of the PUUV nucleocapsid protein were identified. Three conserved peptides containing multiple T and B cell epitopes were selected using a consensus epitope prediction algorithm. Molecular docking using the HPEP dock server demonstrated strong binding interactions between the epitopes and HLA molecules (ten alleles for each class I and II HLA). Moreover, an analysis of population coverage using the IEDB database revealed that the identified peptides have over 90% average population coverage across six continents. Molecular docking and simulation analysis revealed a stable interaction between peptide constructs of the chosen immunogenic peptides and Toll-like receptor 4. These computational analyses demonstrate the immunogenic potential of the selected peptides, which needs to be validated in different experimental systems.
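One simple way to read "90% conservancy" is the fraction of strain sequences that contain a peptide exactly. The sketch below computes that fraction and lists the windows of a reference sequence that meet a threshold. This is an assumption about the metric, not the authors' pipeline: the function names, the exact-match criterion, the default window length of 9, and the toy sequences are all hypothetical.

```python
def conservancy(peptide, sequences):
    """Fraction of sequences that contain the peptide as an exact substring."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    hits = sum(peptide in seq for seq in sequences)
    return hits / len(sequences)


def conserved_peptides(reference, sequences, length=9, threshold=0.9):
    """List (peptide, conservancy) for reference windows at or above threshold.

    length=9 and threshold=0.9 are illustrative defaults.
    """
    seen = set()
    result = []
    for start in range(len(reference) - length + 1):
        peptide = reference[start:start + length]
        if peptide in seen:
            continue  # report each distinct peptide once
        seen.add(peptide)
        score = conservancy(peptide, sequences)
        if score >= threshold:
            result.append((peptide, score))
    return result


if __name__ == "__main__":
    # Toy data: nine identical strains and one with a single substitution.
    strains = ["MKTAYIAKQR"] * 9 + ["MKTAYLAKQR"]
    for peptide, score in conserved_peptides("MKTAYIAKQR", strains, length=5):
        print(peptide, score)
```

Exact matching is the strictest reading. A real analysis might allow a minimum identity below 100% per peptide, which would change which windows pass.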

  • Article type: Journal Article
    Enterovirus genomic replication initiates at a predicted RNA cloverleaf (5'CL) at the 5' end of the RNA genome. The 5'CL contains one stem (SA) and three stem-loops (SLB, SLC, SLD). Here, we present an analysis of 5'CL conservation and divergence for 209 human health-related serotypes from the enterovirus genus, including enterovirus and rhinovirus species. Phylogenetic analysis indicates six distinct 5'CL serotypes that only partially correlate with the species definition. Additional findings include that 5'CL sequence conservation is higher between the EV species than between the RV species, the 5'CL of EVA and EVB are nearly identical, and RVC has the lowest 5'CL conservation. Regions of high conservation throughout all species include SA and the loop and nearby bases of SLB, which is consistent with known protein interactions at these sites. In addition to the known protein binding site for the Poly-C binding protein in the loop of SLB, other conserved consecutive cytosines in the stems of SLB and SLC provide additional potential interaction sites that have not yet been explored. Other sites of conservation, including the predicted bulge of SLD and other conserved stem, loop, and junction regions, are more difficult to explain and suggest additional interactions or structural requirements that are not yet fully understood. This more intricate understanding of sequence and structure conservation and variability in the 5'CL may assist in the development of broad-spectrum antivirals against a wide range of enteroviruses, while better defining the range of virus isotypes expected to be affected by a particular antiviral.

  • Article type: Journal Article
    Cytosolic sulfotransferases (SULTs) are Phase 2 drug-metabolizing enzymes that catalyze the conjugation of sulfonate to endogenous and xenobiotic compounds, increasing their hydrophilicity and excretion from cells. To date, 13 human SULTs have been identified and classified into five families. SULT4A1 mRNA encodes two variants: (1) the wild type, encoding a 284 amino acid, ~33 kDa protein, and (2) an alternative spliced variant resulting from a 126 bp insert between exon 6 and 7, which introduces a premature stop codon that enhances nonsense-mediated decay. SULT4A1 is classified as an SULT based on sequence and structural similarities, including PAPS-domains, active-site His, and the dimerization domain; however, the catalytic pocket lid 'Loop 3' size is not conserved. SULT4A1 is uniquely expressed in the brain and localized in the cytosol and mitochondria. SULT4A1 is highly conserved, with rare intronic polymorphisms that have no outward manifestations. However, the SULT4A1 haplotype is correlated with Phelan-McDermid syndrome and schizophrenia. SULT4A1 knockdown revealed potential SULT4A1 functions in photoreceptor signaling and knockout mice display hampered neuronal development and behavior. Mouse and yeast models revealed that SULT4A1 protects the mitochondria from endogenously and exogenously induced oxidative stress and stimulates cell division, promoting dendritic spines' formation and synaptic transmission. To date, no physiological enzymatic activity has been associated with SULT4A1.

  • Article type: Journal Article
    Ultraconserved elements were discovered two decades ago, arbitrarily defined as sequences that are identical over a length ≥ 200 bp in the human, mouse, and rat genomes. The definition was subsequently extended to sequences ≥ 100 bp identical in at least three of five mammalian genomes (including dog and cow), and shown to have undergone rapid expansion from ancestors in fish and strong negative selection in birds and mammals. Since then, many more genomes have become available, allowing better definition and more thorough examination of ultraconserved element distribution and evolutionary history. We developed a fast and flexible analytical pipeline for identifying ultraconserved elements in multiple genomes, dedUCE, which allows manipulation of minimum length, sequence identity, and number of species with a detectable ultraconserved element according to specified parameters. We suggest an updated definition of ultraconserved elements as sequences ≥ 100 bp and ≥ 97% sequence identity in ≥ 50% of placental mammal orders (12,813 ultraconserved elements). By mapping ultraconserved elements to ∼200 species, we find that placental ultraconserved elements appeared early in vertebrate evolution, well before land colonization, suggesting that the evolutionary pressures driving ultraconserved element selection were present in aquatic environments in the Cambrian-Devonian periods. Most (>90%) ultraconserved elements likely appeared after the divergence of gnathostomes from jawless predecessors, were largely established in sequence identity by early Sarcopterygii evolution (before the divergence of lobe-finned fishes from tetrapods), and became near fixed in the amniotes. Ultraconserved elements are mainly located in the introns of protein-coding and noncoding genes involved in neurological and skeletomuscular development, enriched in regulatory elements, and dynamically expressed throughout embryonic development.
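The updated definition above (length ≥ 100 bp, identity ≥ 97%, present in ≥ 50% of placental mammal orders) can be written as a small predicate. The sketch below is illustrative only and is not the dedUCE pipeline: the function names are hypothetical, and it assumes each order is summarized by a single best percent identity, which is one possible reading of how species are aggregated per order.

```python
def percent_identity(a, b):
    """Percent identity between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def meets_uce_definition(length_bp, identity_by_order,
                         min_len=100, min_identity=97.0, min_order_fraction=0.5):
    """Check the thresholds quoted in the abstract.

    identity_by_order maps an order name to the best percent identity observed
    in that order (an assumed summary, not taken from the abstract).
    """
    if length_bp < min_len or not identity_by_order:
        return False
    passing = sum(1 for ident in identity_by_order.values() if ident >= min_identity)
    return passing / len(identity_by_order) >= min_order_fraction


if __name__ == "__main__":
    # Toy numbers for four orders; two of four pass the 97% threshold.
    toy = {"Primates": 99.0, "Rodentia": 98.0, "Carnivora": 90.0, "Chiroptera": 80.0}
    print(percent_identity("ACGT", "ACGA"))
    print(meets_uce_definition(120, toy))
```

Exposing the three thresholds as parameters mirrors the abstract's point that minimum length, identity, and species coverage are adjustable choices rather than fixed properties of the elements.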

  • Article type: Journal Article
    Magnetoreceptive biology as a field remains relatively obscure; compared with the breadth of species believed to sense magnetic fields, it remains under-studied. Here, we present grounds for the expansion of magnetoreception studies among teleosts. We begin with the electromagnetic perceptive gene (EPG) from Kryptopterus vitreolus and expand to identify 72 teleosts with homologous proteins containing a conserved three-phenylalanine (3F) motif. Phylogenetic analysis provides insight as to how EPG may have evolved over time and indicates that certain clades may have experienced a loss of function driven by different fitness pressures. One potential factor is water type, with freshwater fish significantly more likely to possess the functional motif version (FFF), and saltwater fish to have the non-functional variant (FXF). It was also revealed that when the 3F motif from the homologue of Brachyhypopomus gauderio (B.g.) is inserted into EPG, giving EPG(B.g.), the response (as indicated by increased intracellular calcium) is faster. This indicates that EPG has the potential to be engineered to improve upon its response and increase its utility as a controller for specific outcomes.
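The FFF versus FXF distinction described above amounts to a check on the middle residue of the motif. The sketch below classifies a motif and tallies classes by water type. The function names and the toy records are hypothetical (they are not data from the study), and it assumes the three motif residues have already been extracted from an alignment.

```python
from collections import Counter


def classify_3f_motif(tripeptide):
    """Classify the three residues at the motif position.

    'FFF' is the version the abstract calls functional; 'FXF' (middle residue
    not phenylalanine) is the variant it calls non-functional.
    """
    t = tripeptide.upper()
    if len(t) != 3:
        raise ValueError("expected exactly three residues")
    if t == "FFF":
        return "FFF"
    if t[0] == "F" and t[2] == "F":
        return "FXF"
    return "other"


def tally_by_water_type(records):
    """Count motif classes per water type from (water_type, tripeptide) pairs."""
    return Counter((water, classify_3f_motif(motif)) for water, motif in records)


if __name__ == "__main__":
    # Made-up records for illustration only.
    toy = [("freshwater", "FFF"), ("freshwater", "FFF"),
           ("saltwater", "FLF"), ("saltwater", "FFF")]
    for key, count in sorted(tally_by_water_type(toy).items()):
        print(key, count)
```

A contingency table built this way is the kind of input a test of association between water type and motif class would need.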

  • Article type: Journal Article
    Gene expression in Arabidopsis is regulated by more than 1,900 transcription factors (TFs), which have been identified genome-wide by the presence of well-conserved DNA-binding domains. Activator TFs contain activation domains (ADs) that recruit coactivator complexes; however, for nearly all Arabidopsis TFs, we lack knowledge about the presence, location and transcriptional strength of their ADs. To address this gap, here we use a yeast library approach to experimentally identify Arabidopsis ADs on a proteome-wide scale, and find that more than half of the Arabidopsis TFs contain an AD. We annotate 1,553 ADs, the vast majority of which are, to our knowledge, previously unknown. Using the dataset generated, we develop a neural network to accurately predict ADs and to identify sequence features that are necessary to recruit coactivator complexes. We uncover six distinct combinations of sequence features that result in activation activity, providing a framework to interrogate the subfunctionalization of ADs. Furthermore, we identify ADs in the ancient AUXIN RESPONSE FACTOR family of TFs, revealing that AD positioning is conserved in distinct clades. Our findings provide a deep resource for understanding transcriptional activation, a framework for examining function in intrinsically disordered regions and a predictive model of ADs.