Conserved sequence

  • Article type: Journal Article
    The initial adoption of penicillin as an antibiotic marked the start of exploring other compounds essential for pharmaceuticals, yet resistance to penicillins and their side effects has compromised their efficacy. The N-terminal nucleophile (Ntn) amide-hydrolases S45 family plays a key role in catalyzing amide bond hydrolysis in various compounds, including antibiotics like penicillin and cephalosporin. This study comprehensively analyzes the structural and functional traits of the bacterial N-terminal nucleophile (Ntn) amide-hydrolases S45 family, covering penicillin G acylases, cephalosporin acylases, and D-succinylase. Utilizing structural bioinformatics tools and sequence analysis, the investigation delineates structurally conserved regions (SCRs) and substrate binding site variations among these enzymes. Notably, sixteen SCRs crucial for substrate interaction are identified solely through sequence analysis, emphasizing the significance of sequence data in characterizing functionally relevant regions. These findings introduce a novel approach for identifying targets to enhance the biocatalytic properties of N-terminal nucleophile (Ntn) amide-hydrolases, while facilitating the development of more accurate three-dimensional models, particularly for enzymes lacking structural data. Overall, this research advances our understanding of structure-function relationships in bacterial N-terminal nucleophile (Ntn) amide-hydrolases, providing insights into strategies for optimizing their enzymatic capabilities.

  • Article type: Journal Article
    Throughout embryonic development, the shaping of the functional and morphological characteristics of embryos is orchestrated by an intricate interaction between transcription factors and cis-regulatory elements. In this study, we conducted a comprehensive analysis of deuterostome cis-regulatory landscapes during gastrulation, focusing on four paradigmatic species: the echinoderm Strongylocentrotus purpuratus, the cephalochordate Branchiostoma lanceolatum, the urochordate Ciona intestinalis, and the vertebrate Danio rerio. Our approach involved comparative computational analysis of ATAC-seq datasets to explore the genome-wide blueprint of conserved transcription factor binding motifs underlying gastrulation. We identified a core set of conserved DNA binding motifs associated with 62 known transcription factors, indicating the remarkable conservation of the gastrulation regulatory landscape across deuterostomes. Our findings offer valuable insights into the evolutionary molecular dynamics of embryonic development, shedding light on conserved regulatory subprograms and providing a comprehensive perspective on the conservation and divergence of gene regulation underlying the gastrulation process.

  • Article type: Journal Article
    Currently, the main α-amylase family GH13 is divided into 47 subfamilies in CAZy, with new subfamilies emerging regularly. The present in silico study was performed to highlight two groups, represented by the maltogenic amylase from Thermotoga neapolitana and the α-amylase from Haloarcula japonica, that merit their own new GH13 subfamilies. This extends functional annotation and thus allows more precise prediction of the function of putative proteins. Interestingly, the two groups share certain sequence features, e.g. the highly conserved cysteine in the second conserved sequence region (CSR-II) directly preceding the catalytic nucleophile, or the well-preserved GQ character of the end of CSR-VII. On the other hand, the two groups also bear specific and highly conserved positions that distinguish them not only from each other but also from representatives of the remaining GH13 subfamilies established so far. For the T. neapolitana maltogenic amylase group, it is the stretch of residues at the end of CSR-V, highly conserved as L-[DN]. The H. japonica α-amylase group can be characterized by a highly conserved [WY]-[GA] sequence at the end of CSR-II. Other specific sequence features include an almost fully conserved aspartic acid directly preceding the general acid/base in CSR-III and a well-preserved glutamic acid in CSR-IV. The assumption that these two groups represent two mutually related, but simultaneously independent, GH13 subfamilies has been supported by phylogenetic analysis as well as by comparison of tertiary structures. The main α-amylase family GH13 has thus been expanded by two novel subfamilies, GH13_48 and GH13_49. KEY POINTS: • In silico analysis of two groups of family GH13 members with characterized representatives • Identification of certain common, but also some specific, sequence features in seven CSRs • Creation of two novel subfamilies, GH13_48 and GH13_49, within the CAZy database.
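    The signatures quoted in this abstract, L-[DN] and [WY]-[GA], use PROSITE-style notation, where square brackets list the residues allowed at one position. The sketch below shows one way such a pattern could be turned into a regular expression and located in a protein sequence. The function names and the toy fragments are assumptions made for illustration; they are not real enzyme sequences and not the authors' tooling, and only a small subset of the PROSITE syntax is handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern such as 'L-[DN]' into a regex.

    Handles single residues, [alternatives], {exclusions} and 'x' (any residue).
    Repetition counts such as x(2,4) are deliberately not supported here.
    """
    parts = []
    for element in pattern.split("-"):
        if element == "x":
            parts.append(".")                          # any residue
        elif element.startswith("{") and element.endswith("}"):
            parts.append("[^" + element[1:-1] + "]")   # excluded residues
        else:
            parts.append(element)                      # residue or [alternatives]
    return "".join(parts)


def find_motif(sequence: str, pattern: str) -> list[int]:
    """Return 0-based start positions of every match, overlapping ones included."""
    # A zero-width lookahead lets finditer report overlapping occurrences.
    regex = re.compile("(?=" + prosite_to_regex(pattern) + ")")
    return [match.start() for match in regex.finditer(sequence)]


# Hypothetical toy fragments, chosen only to exercise the two patterns.
toy_csr_v_end = "GFRLDAAK"
toy_csr_ii_end = "GFRCDAAWG"
print(find_motif(toy_csr_v_end, "L-[DN]"))
print(find_motif(toy_csr_ii_end, "[WY]-[GA]"))
```

    In practice a short pattern like L-[DN] matches at many unrelated positions, so it is only informative when the search is restricted to the aligned CSR it was defined for.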

  • Article type: Journal Article
    Hepatitis C virus (HCV) is a plus-stranded RNA virus that often chronically infects liver hepatocytes and causes liver cirrhosis and cancer. These viruses replicate their genomes employing error-prone replicases. Thereby, they routinely generate a large 'cloud' of RNA genomes (quasispecies) which, by trial and error, comprehensively explore the sequence space available for functional RNA genomes that maintain the ability for efficient replication and immune escape. In this context, it is important to identify which RNA secondary structures in the sequence space of the HCV genome are conserved, likely due to functional requirements. Here, we provide the first genome-wide multiple sequence alignment (MSA) with the prediction of RNA secondary structures throughout all representative full-length HCV genomes. We selected 57 representative genomes by clustering all complete HCV genomes from the BV-BRC database based on k-mer distributions and dimension reduction, and by adding RefSeq sequences. We include annotations of previously recognized features for easy comparison to other studies. Our results indicate that mainly the core coding region, the C-terminal NS5A region, and the NS5B region contain secondary structure elements that are conserved beyond coding sequence requirements, indicating functionality on the RNA level. In contrast, the genome regions in between contain less highly conserved structures. The results provide a complete description of all conserved RNA secondary structures and make clear that functionally important RNA secondary structures are present in certain HCV genome regions but are largely absent from other regions. Full-genome alignments of all branches of Hepacivirus C are provided in the supplement.
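    The abstract selects representative genomes by clustering on k-mer distributions. As a rough illustration of that idea only, the sketch below builds normalized k-mer frequency profiles and greedily keeps a genome when its profile is farther than a threshold from every representative kept so far. This is a deliberate simplification: the greedy rule, the threshold, the function names, and the toy sequences are all assumptions, and the dimension-reduction step mentioned in the abstract is not reproduced.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(seq: str, k: int = 3, alphabet: str = "ACGT") -> list[float]:
    """Normalized frequencies of all k-mers over `alphabet`, in a fixed order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # k-mers containing other symbols (e.g. N) are ignored by construction.
    total = sum(counts[kmer] for kmer in kmers) or 1
    return [counts[kmer] / total for kmer in kmers]


def pick_representatives(genomes: dict[str, str], k: int = 3,
                         threshold: float = 0.1) -> list[str]:
    """Greedy selection: keep a genome if it is far from all kept genomes."""
    profiles = {name: kmer_profile(seq, k) for name, seq in genomes.items()}
    representatives: list[str] = []
    for name in genomes:
        if all(math.dist(profiles[name], profiles[rep]) > threshold
               for rep in representatives):
            representatives.append(name)
    return representatives


# Hypothetical toy sequences, far shorter than any real genome.
toy_genomes = {
    "genome_a": "ACGTACGTACGT",
    "genome_a_copy": "ACGTACGTACGT",
    "genome_c": "GGGGCCCCGGGG",
}
print(pick_representatives(toy_genomes))
```

    A greedy pass depends on input order; a real pipeline would more likely cluster the reduced profiles and take one genome per cluster.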

  • Article type: Journal Article
    Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has prolonged the duration of the pandemic because of the continuous emergence of new variant strains. The emergence of these mutant strains makes it difficult to detect the virus with existing antibodies; thus, novel antibodies that can target both the variants and the original strain are needed. In this study, we generated a high-affinity monoclonal antibody (5G2) against the highly conserved region of the SARS-CoV-2 spike protein to detect the protein variants. Moreover, we generated its single-chain variable antibody fragment (sc5G2). The sc5G2 expressed in mammalian and bacterial cells detected the spike protein of the original SARS-CoV-2 and variant strains. The resulting sc5G2 will be a useful tool to detect the original SARS-CoV-2 and variant strains.

  • Article type: Journal Article
    Enzymes play a crucial role in industrial production and pharmaceutical development, serving as catalysts for numerous biochemical reactions. Determining the optimal catalytic temperature (Topt) of enzymes is crucial for optimizing reaction conditions, enhancing catalytic efficiency, and accelerating industrial processes. However, due to the limited availability of experimentally determined Topt data and the insufficient accuracy of existing computational methods in predicting Topt, there is an urgent need for a computational approach to predict the Topt values of enzymes accurately. In this study, using phosphatase (EC 3.1.3.X) as an example, we constructed a machine learning model utilizing amino acid frequency and protein molecular weight information as features and employing the K-nearest neighbors regression algorithm to predict the Topt of enzymes. Usually, when engineering enzymes for thermostability, researchers tend not to modify conserved amino acids. Therefore, we utilized this machine learning model to predict the Topt of phosphatase sequences after removing conserved amino acids. We found that the predictive model's mean coefficient of determination (R²) value increased from 0.599 to 0.755 compared to the model based on the complete sequences. Subsequently, experimental validation on 10 phosphatase enzymes with undetermined optimal catalytic temperatures showed that, for most of them, the values predicted from the sequence without conserved amino acids are closer to the experimental optimal catalytic temperature values. This study lays the foundation for the rapid selection of enzymes suitable for industrial conditions.
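    To make the modelling idea concrete, the following is a minimal sketch of K-nearest neighbors regression on amino acid frequencies computed after conserved positions are removed. Several things here are assumptions rather than details from the abstract: conserved positions are supplied as a set of 0-based indices (the abstract does not say how they were identified), the molecular weight feature is omitted to avoid feature-scaling questions, and the sequences and Topt values are invented toy data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def strip_positions(seq: str, conserved_positions: set[int]) -> str:
    """Drop residues at the given 0-based positions (assumed to be conserved)."""
    return "".join(aa for i, aa in enumerate(seq) if i not in conserved_positions)


def aa_frequencies(seq: str) -> list[float]:
    """Fraction of each of the 20 standard amino acids, in AMINO_ACIDS order."""
    n = len(seq) or 1
    return [seq.count(aa) / n for aa in AMINO_ACIDS]


def knn_predict(train: list[tuple[list[float], float]],
                query: list[float], k: int = 3) -> float:
    """Average the target value of the k training points closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical toy data: (sequence, conserved positions, Topt in degrees C).
toy_enzymes = [
    ("MKKLLDEEAG", {0, 5}, 37.0),
    ("MRRVVDEEPG", {0, 5}, 55.0),
    ("MKKLIDEEAG", {0, 5}, 40.0),
]
train = [(aa_frequencies(strip_positions(seq, cons)), topt)
         for seq, cons, topt in toy_enzymes]
query = aa_frequencies(strip_positions("MKKLLDEDAG", {0, 5}))
print(knn_predict(train, query, k=2))
```

    With real data the value of k and the feature set would be chosen by cross-validation, which is how a change in R² such as the one reported above would be measured.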

  • Article type: Journal Article
    In the face of the SARS-CoV-2 pandemic, characterized by the virus's rapid mutation rate, developing timely and targeted therapeutic and diagnostic interventions presents a significant challenge. This study utilizes bioinformatic analyses to pinpoint conserved genomic regions within SARS-CoV-2, offering a strategic advantage in the fight against this and future pathogens. Our approach has enabled the creation of a diagnostic assay that is not only rapid, reliable, and cost-effective but also possesses a remarkable capacity to detect a wide array of current and prospective variants with unmatched precision. The significance of our findings lies in the demonstration that focusing on these conserved genomic sequences can significantly enhance our preparedness for and response to emerging infectious diseases. By providing a blueprint for the development of versatile diagnostic tools and therapeutics, this research paves the way for a more effective global pandemic response strategy.
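    The abstract does not describe how the conserved regions were found. One simple way to illustrate the general idea is to scan a multiple sequence alignment for runs of columns in which every sequence carries the same non-gap base. The sketch below does only that; the function name, the minimum run length, and the toy alignment are assumptions, and a real analysis would tolerate some variation and work on thousands of genomes.

```python
def conserved_runs(alignment: list[str], min_len: int = 5) -> list[tuple[int, int]]:
    """Return half-open (start, end) column ranges that are identical in all rows.

    A column counts as conserved only if every sequence has the same
    character there and that character is not a gap ('-').
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    start = None
    # Iterate one column past the end so a run reaching the end is closed.
    for col in range(length + 1):
        conserved = (
            col < length
            and alignment[0][col] != "-"
            and all(row[col] == alignment[0][col] for row in alignment)
        )
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


# Hypothetical toy alignment, not real viral sequence.
toy_alignment = [
    "ACGTACGTAA",
    "ACGTACCTAA",
    "ACGTAC-TAA",
]
print(conserved_runs(toy_alignment, min_len=4))
```

    Candidate primer or probe sites would then be drawn from the reported ranges, subject to the usual length and composition constraints.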

  • Article type: Journal Article
    Sequence differences among the subtypes of Clostridioides difficile toxin TcdB (2,366 amino acids) are broadly distributed across the entire protein, with the notable exception of 76 residues at the protein's carboxy terminus. This sequence invariable region (SIR) is identical at the DNA and protein level among the TcdB variants, suggesting this string of amino acids has undergone selective pressure to prevent alterations. The functional role of the SIR domain in TcdB has not been determined. Analysis of a recombinantly constructed TcdB mutant lacking the SIR domain did not identify changes in TcdB's enzymatic or cytopathic activities. To further assess the SIR region, we constructed a C. difficile strain with the final 228 bp deleted from the tcdB gene, resulting in the production of a truncated form of TcdB lacking the SIR (TcdB2∆2291-2366). Using a combination of approaches, we found that, in the absence of the SIR sequence, TcdB2∆2291-2366 retained cytotoxic activity but was not secreted from C. difficile. TcdB2∆2291-2366 was not released from the cell under autolytic conditions, indicating the SIR is involved in a more discrete step in toxin escape from the bacterium. Fractionation experiments combined with antibody detection found that TcdB2∆2291-2366 accumulates at the cell membrane but is unable to complete steps in secretion beyond this point. These data suggest conservation of the SIR domain across variants of TcdB could be influenced by the sequence's role in efficient escape of the toxin from C. difficile.
    OBJECTIVE: Clostridioides difficile is a leading cause of antibiotic-associated disease in the United States. The primary virulence factors produced by C. difficile are two large glucosylating toxins, TcdA and TcdB. To date, several sequence variants of TcdB have been identified that differ in various functional properties. Here, we identified a highly conserved region among TcdB subtypes that is required for release of the toxin from C. difficile. This study reveals a putative role for the longest stretch of invariable sequence among TcdB subtypes and provides new details regarding toxin release into the extracellular environment. Improving our understanding of the functional roles of the conserved regions of TcdB variants aids in the development of new, broadly applicable strategies to treat C. difficile infection (CDI).
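    The numbers in this abstract are mutually consistent, which a short calculation makes explicit: residues 2291 through 2366 span 76 positions, and 76 codons correspond to the 228 bp deletion. The helper names below are made up for illustration, and the calculation assumes the deletion covers whole codons of the coding sequence only; how the stop codon was handled is not stated in the abstract.

```python
def residues_in_range(first: int, last: int) -> int:
    """Number of residues in an inclusive, 1-based residue range."""
    return last - first + 1


def coding_bp(residue_count: int) -> int:
    """Base pairs needed to encode a number of residues (3 bp per codon)."""
    return residue_count * 3


sir_residues = residues_in_range(2291, 2366)
print(sir_residues, coding_bp(sir_residues))
```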

  • Article type: Journal Article
    Giardia duodenalis, a major cause of waterborne infection, infects a wide range of mammalian hosts and is subdivided into eight genetically well-defined assemblages named A through H. However, fragmented genomes and a lack of comparative analysis within and between the assemblages leave the molecular mechanisms controlling host specificity and differential disease outcomes unclear. To address this, we generated a near-complete de novo genome of the AI assemblage by sequencing the Be-2 genome on the Oxford Nanopore platform. We generated 148,144 long reads with quality scores above 7. The final genome assembly consists of only nine contigs with an N50 of 3,045,186 bp. This assembly agrees closely with the assembly of another strain in the AI assemblage (WB-C6). However, a critical difference is that a region previously placed in the five-prime region of Chr5 belongs to Chr4 of Be-2. We find a high degree of conservation in the ploidy, homozygosity, and the presence of cysteine-rich variant-specific surface proteins (VSPs) within the AI assemblage. Our assembly provides a nearly complete genome of a member of the AI assemblage of G. duodenalis, aiding population genomic studies capable of elucidating Giardia transmission, host range, and pathogenicity.
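    The N50 quoted above is a standard contiguity statistic: the length of the contig at which, going from longest to shortest, the running total first reaches half of the assembly size. A minimal sketch of the calculation follows; the contig lengths in the example are invented and are not those of the Be-2 assembly.

```python
def n50(contig_lengths: list[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the assembly."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for a non-empty list of lengths")


# Hypothetical contig lengths in bp.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

    With only nine contigs and an N50 above 3 Mbp, as reported here, most of the assembly sits in a handful of chromosome-scale sequences.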

  • Article type: Journal Article
    Histone variants are paralogs that replace canonical histones in nucleosomes, often imparting novel functions. However, how histone variants arise and evolve is poorly understood. Reconstruction of histone protein evolution is challenging due to large differences in evolutionary rates across gene lineages and sites. Here we used intron position data from 108 nematode genomes in combination with amino acid sequence data to find disparate evolutionary histories of the three H2A variants found in Caenorhabditis elegans: the ancient H2A.Z (HTZ-1), the sperm-specific HTAS-1, and HIS-35, which differs from the canonical S-phase H2A by a single glycine-to-alanine C-terminal change. Although the H2A.Z (HTZ-1) protein sequence is highly conserved, its gene exhibits recurrent intron gain and loss. This pattern suggests that specific intron sequences or positions may not be important to H2A.Z functionality. For HTAS-1 and HIS-35, we find variant-specific intron positions that are conserved across species. Patterns of intron position conservation indicate that the sperm-specific variant HTAS-1 arose more recently in the ancestor of a subset of Caenorhabditis species, while HIS-35 arose in the ancestor of Caenorhabditis and its sister group, including the genus Diploscapter. HIS-35 exhibits gene retention in some descendant lineages but gene loss in others, suggesting that histone variant use or functionality can be highly flexible. Surprisingly, we find the single amino acid differentiating HIS-35 from core H2A is ancestral and common across canonical Caenorhabditis H2A sequences. Thus, we speculate that the role of HIS-35 lies not in encoding a functionally distinct protein, but instead in enabling H2A expression across the cell cycle or in distinct tissues. This work illustrates how genes encoding such partially redundant functions may be advantageous yet relatively replaceable over evolutionary timescales, consistent with the patchwork pattern of retention and loss of both genes. Our study shows the utility of intron positions for reconstructing evolutionary histories of gene families, particularly those undergoing idiosyncratic sequence evolution.