pangenome

  • Article type: Journal Article
    Most in silico evolutionary studies assume that core genes are essential for cellular function, while accessory genes are dispensable, particularly in nutrient-rich environments. However, this assumption is seldom tested genetically within the pangenome context. In this study, we conducted a robust pangenomic Tn-seq analysis of fitness genes in a nutrient-rich medium for Sinorhizobium strains with a canonical open pangenome. To evaluate the robustness of fitness category assignment, Tn-seq data for three independent mutant libraries per strain were analyzed by three methods. The Hidden Markov Model (HMM)-based method was the most robust to variation between mutant libraries and was insensitive to data size, outperforming the Bayesian and Monte Carlo simulation-based methods. Consequently, the HMM method was used to classify the fitness category. Fitness genes, categorized as essential (ES), advantage (GA), and disadvantage (GD) genes for growth, are enriched in core genes, while nonessential (NE) genes are over-represented in accessory genes. Accessory ES/GA genes showed a lower fitness effect than core ES/GA genes. Connectivity degrees in the cofitness network decrease in the order of ES, GD, and GA/NE. In addition to accessory genes, 1,599 out of 3,284 core genes display differential essentiality across test strains. Within the pangenome core, both shared quasi-essential (ES and GA) and strain-dependent fitness genes are enriched in similar functional categories. Our analysis demonstrates a considerable fuzzy essential zone determined by cofitness connectivity degrees in the Sinorhizobium pangenome and highlights the power of the cofitness network in understanding the genetic basis of ever-increasing prokaryotic pangenome data.

  • Article type: Journal Article
    Reniform and root-knot nematodes are two of the most destructive pests of conventional upland cotton, Gossypium hirsutum L., and continue to be a major threat to cotton fiber production in semi-arid regions of the southern United States and Central America. Fortunately, naturally occurring tolerance to these nematodes has been identified in the Pima cotton species (G. barbadense) and several upland cotton varieties (G. hirsutum), which has led to a robust breeding program that has successfully introgressed and stacked these independent resistance traits into several upland cotton lineages with superior agronomic traits, e.g., BAR 32-30 and BARBREN-713. This work identifies the genomic variations of these nematode-tolerant accessions by comparing their respective genomes to the susceptible, high-quality fiber-producing parental line of this lineage: Phytogen 355 (PSC355). We discover several large genomic differences within marker regions that harbor putative resistance genes, as well as expression mechanisms shared by the two resistant lines, with respect to the susceptible PSC355 parental line. This work emphasizes the utility of whole genome comparisons as a means of elucidating large and small nuclear differences by lineage and phenotype.

  • Article type: Journal Article
    Staphylococcus aureus causes both hospital- and community-acquired infections in humans worldwide. Due to the high incidence of infection, S. aureus is also one of the most sampled and sequenced pathogens today, providing an outstanding resource to understand variation at the bacterial subspecies level. We processed and downsampled 83,383 public S. aureus Illumina whole-genome shotgun sequences and 1,263 complete genomes to produce 7,954 representative substrains. Pairwise comparison of average nucleotide identity revealed a natural boundary of 99.5% that could be used to define 145 distinct strains within the species. We found that intermediate-frequency genes in the pangenome (present in 10%-95% of genomes) could be divided into those closely linked to strain background ("strain-concentrated") and those highly variable within strains ("strain-diffuse"). Non-core genes had different patterns of chromosome location. Notably, strain-diffuse genes were associated with prophages; strain-concentrated genes were associated with the vSaβ genome island; and rare genes (<10% frequency) were concentrated near the origin of replication. Antibiotic resistance genes were enriched in the strain-diffuse class, while virulence genes were distributed between strain-diffuse, strain-concentrated, core, and rare classes. This study shows how different patterns of gene movement help create strains as distinct subspecies entities and provides insight into the diverse histories of important S. aureus functions.
    OBJECTIVE: We analyzed the genomic diversity of Staphylococcus aureus, a globally prevalent bacterial species that causes serious infections in humans. Our goal was to build a genetic picture of the different strains of S. aureus and which genes may be associated with them. We reprocessed >84,000 genomes and subsampled to remove redundancy. We found that individual samples sharing >99.5% of their genome could be grouped into strains. We also showed that a portion of genes that are present in intermediate frequency in the species are strongly associated with some strains but completely absent from others, suggesting a role in strain specificity. This work lays the foundation for understanding individual gene histories of the S. aureus species and also outlines strategies for processing large bacterial genomic data sets.
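    The frequency classes named in this abstract can be illustrated with a minimal Python sketch. The 10% and 95% cutoffs come from the abstract; how values exactly on a cutoff are handled, the function name `frequency_class`, and the gene counts are assumptions made for illustration only and are not taken from the study.

```python
from collections import Counter


def frequency_class(n_present, n_genomes, core_min=0.95, rare_max=0.10):
    """Assign a gene to a frequency class from its presence count.

    Assumption: genes exactly at 10% or 95% fall in the intermediate
    class; the abstract does not say how boundary values are treated.
    """
    freq = n_present / n_genomes
    if freq > core_min:
        return "core"
    if freq < rare_max:
        return "rare"
    return "intermediate"


# Hypothetical presence counts across 200 genomes (not data from the study)
presence_counts = {"geneA": 200, "geneB": 150, "geneC": 12, "geneD": 20, "geneE": 190}

classes = {gene: frequency_class(n, 200) for gene, n in presence_counts.items()}
summary = Counter(classes.values())
print(classes)
print(summary)
```

    Splitting the intermediate class further into "strain-concentrated" and "strain-diffuse" genes would additionally need the strain assignment of every genome, which this sketch does not model.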

  • Article type: Journal Article
    Bacteroides fragilis is a Gram-negative commensal bacterium commonly found in the human colon, which differentiates into two genomospecies termed divisions I and II. Through a comprehensive collection of 694 B. fragilis whole genome sequences, we identify novel features distinguishing these divisions. Our study reveals a distinct geographic distribution with division I strains predominantly found in North America and division II strains in Asia. Additionally, division II strains are more frequently associated with bloodstream infections, suggesting a distinct pathogenic potential. We report differences between the two divisions in gene abundance related to metabolism, virulence, stress response, and colonization strategies. Notably, division II strains harbor more antimicrobial resistance (AMR) genes than division I strains. These findings offer new insights into the functional roles of division I and II strains, indicating specialized niches within the intestine and potential pathogenic roles in extraintestinal sites.
    OBJECTIVE: Understanding the distinct functions of microbial species in the gut microbiome is crucial for deciphering their impact on human health. Classifying division II strains as Bacteroides fragilis can lead to erroneous associations, as researchers may mistakenly attribute characteristics observed in division II strains to the more extensively studied division I B. fragilis. Our findings underscore the necessity of recognizing these divisions as separate species with distinct functions. We unveil new findings of differential gene prevalence between division I and II strains in genes associated with intestinal colonization and survival strategies, potentially influencing their role as gut commensals and their pathogenicity in extraintestinal sites. Despite the significant niche overlap and colonization patterns between these groups, our study highlights the complex dynamics that govern strain distribution and behavior, emphasizing the need for a nuanced understanding of these microorganisms.

  • Article type: Journal Article
    BACKGROUND: Transposable Elements (TEs) are segments of DNA, typically a few hundred base pairs up to several tens of thousands of bases long, that have the ability to generate new copies of themselves in the genome. Most existing methods used to identify TEs in a newly sequenced genome are based on their repetitive character, together with detection based on homology and structural features. As new high-quality assemblies become more common, including the availability of multiple independent assemblies from the same species, an alternative strategy for identification of TE families becomes possible in which we focus on the polymorphism at insertion sites caused by TE mobility.
    RESULTS: We develop the idea of using the structural polymorphisms found in pangenomes to create a library of the TE families recently active in a species, or in a closely related group of species. We present a tool, pantera, that achieves this task, and illustrate its use both on species with well-curated libraries, and on new assemblies.
    CONCLUSIONS: Our results show that pantera is sensitive and accurate, tending to correctly identify complete elements with precise boundaries, and is particularly well suited to detect larger, low copy number TEs that are often undetected with existing de novo methods.

  • Article type: Journal Article
    Strains across the Lactobacillaceae family form the basis for a trillion-dollar industry. Our understanding of the genomic basis for their key traits is fragmented, however, including the metabolism that is foundational to their industrial uses. Pangenome analysis of publicly available Lactobacillaceae genomes allowed us to generate genome-scale metabolic network reconstructions for 26 species of industrial importance. Their manual curation led to more than 75,000 gene-protein-reaction associations that were deployed to generate 2,446 genome-scale metabolic models. Cross-referencing genomes and known metabolic traits allowed for manual metabolic network curation and validation of the metabolic models. As a result, we provide the first pangenomic basis for metabolism in the Lactobacillaceae family and a collection of predictive computational metabolic models that enable a variety of practical uses.
    IMPORTANCE: Lactobacillaceae, a bacterial family foundational to a trillion-dollar industry, is increasingly relevant to biosustainability initiatives. Our study, leveraging approximately 2,400 genome sequences, provides a pangenomic analysis of Lactobacillaceae metabolism, creating over 2,400 curated and validated genome-scale models (GEMs). These GEMs successfully predict (i) unique, species-specific metabolic reactions; (ii) niche-enriched reactions that increase organism fitness; (iii) essential media components, offering insights into the global amino acid essentiality of Lactobacillaceae; and (iv) fermentation capabilities across the family, shedding light on the metabolic basis of Lactobacillaceae-based commercial products. This quantitative understanding of Lactobacillaceae metabolic properties and their genomic basis will have profound implications for the food industry and biosustainability, offering new insights and tools for strain selection and manipulation.

  • Article type: Journal Article
    The presence of intermittently dispersed insertion sequences and transposases in the Mycobacterium tuberculosis (Mtb) genome makes intra-genome recombination events inevitable. Understanding their effect on the gene repertoires (GR), which may contribute to the development of drug-resistant Mtb, is critical. In this study, publicly available WGS data of clinical Mtb isolates (endemic region n = 2,601; non-endemic region n = 1,130) were de novo assembled, filtered, scaffolded into assemblies, and functionally annotated. Out of 2,601 Mtb WGS data sets from endemic regions, 2,184 (drug resistant/sensitive: 1,386/798) qualified as high quality. We identified 3,784 core genes, 123 softcore genes, 224 shell genes, and 762 cloud genes in the pangenome of Mtb clinical isolates from endemic regions. Sets of 33 and 39 genes showed positive and negative associations (P < 0.01) with drug resistance status, respectively. Gene ontology clustering showed compromised immunity to phages and impaired DNA repair in drug-resistant Mtb clinical isolates compared to the sensitive ones. Multidrug efflux pump repressor genes (Rv3830c and Rv3855c) and CRISPR genes (Rv2816c-19c) were absent in the drug-resistant Mtb. A separate WGS data analysis of drug-resistant Mtb clinical isolates from the Netherlands (n = 1,130) also showed the absence of CRISPR genes (Rv2816c-17c). This study highlights the role of CRISPR genes in drug resistance development in Mtb clinical isolates and helps in understanding its evolutionary trajectory and in identifying useful targets for diagnostics development.
    IMPORTANCE: The results from the present Pan-GWAS study comparing gene sets in drug-resistant and drug-sensitive Mtb clinical isolates revealed intricate presence-absence patterns of genes encoding DNA-binding proteins having gene regulatory as well as DNA modification and DNA repair roles. Apart from the genes with known functions, some uncharacterized and hypothetical genes that seem to have a potential role in drug resistance development in Mtb were identified. We have been able to relate many findings of the present study to the existing literature on the molecular aspects of drug-resistant Mtb, further strengthening the relevance of the results presented in this study.
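    A gene presence-absence association of the kind reported above can be sketched with a two-sided Fisher's exact test on a 2x2 table. This is an illustrative assumption: the abstract does not state which test the authors used, and the function name `fisher_exact_two_sided` and the counts below are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: resistant isolates carrying the gene   b: resistant isolates lacking it
    c: sensitive isolates carrying the gene   d: sensitive isolates lacking it
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among resistant isolates
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts (not from the study): gene present in 8 of 10 resistant
# isolates and in 1 of 10 sensitive isolates.
p_value = fisher_exact_two_sided(8, 2, 1, 9)
print(f"p = {p_value:.4f}", "significant at 0.01" if p_value < 0.01 else "not significant")
```

    In a pangenome-wide scan this test would be repeated for every accessory gene, so some form of multiple-testing correction would normally be applied on top of the per-gene P values.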

  • Article type: Journal Article
    Dusky kob (Argyrosomus japonicus) is a commercially important finfish, indigenous to South Africa, Australia, and China. Previous studies highlighted differences in genetic composition, life history, and morphology of the species across geographic regions. A draft genome sequence of 0.742 Gb (N50 = 5.49 Mb; BUSCO completeness = 97.8%) and 22,438 predicted protein-coding genes was generated for the South African (SA) conspecific. A comparison with the Chinese (CN) conspecific revealed a core set of 32,068 orthologous protein clusters across both genomes. The SA genome exhibited 440 unique clusters compared to 1,928 unique clusters in the CN genome. Transportation and immune response processes were overrepresented in the SA accessory genome, whereas the CN accessory genome was enriched for immune response, DNA transposition, and sensory detection (FDR-adjusted p < 0.01). These unique clusters may represent an adaptive component of the species' pangenome that could explain population divergence due to differential environmental specialisation. Furthermore, 700 single-copy orthologues (SCOs) displayed evidence of positive selection between the SA and CN genomes, and globally these genomes shared only 92% similarity, suggesting they might be distinct species. These genes primarily play roles in metabolism and digestion, illustrating the evolutionary pathways that differentiate the species. Understanding these genomic mechanisms underlying adaptation and evolution within and between species provides valuable insights into growth and maturation of kob, traits that are particularly relevant to commercial aquaculture.
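    The split into core and genome-specific clusters described above amounts to set operations on cluster membership. The following minimal Python sketch assumes each genome is represented by the set of orthologous cluster IDs it contains; the function name `partition_clusters` and the cluster IDs are hypothetical and do not come from the study.

```python
def partition_clusters(clusters_by_genome):
    """Split cluster IDs into a core set (present in all genomes) and
    per-genome unique sets (present in exactly that one genome)."""
    genomes = list(clusters_by_genome)
    core = set.intersection(*(set(clusters_by_genome[g]) for g in genomes))
    unique = {}
    for g in genomes:
        # Union of the clusters found in every other genome
        others = set().union(*(set(clusters_by_genome[o]) for o in genomes if o != g))
        unique[g] = set(clusters_by_genome[g]) - others
    return core, unique


# Hypothetical cluster IDs for the two conspecifics (not data from the study)
clusters = {
    "SA": {"c1", "c2", "c3", "c4"},
    "CN": {"c2", "c3", "c5", "c6", "c7"},
}

core, unique = partition_clusters(clusters)
print(sorted(core))
print({g: sorted(ids) for g, ids in unique.items()})
```

    With only two genomes the unique sets are simply the two set differences; the same function extends to more genomes, where clusters shared by some but not all genomes fall into neither the core nor a unique set.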

  • Article type: Journal Article
    To assess the genomic diversity of Fusarium oxysporum f. sp. lini strains and compile a comprehensive gene repertoire, we constructed a pangenome using 13 isolates from four different clonal lineages, each exhibiting distinct levels of virulence. Syntenic analyses of two selected genomes revealed significant chromosomal rearrangements unique to each genome. A comprehensive examination of both core and accessory pangenome content and diversity points to an open genome state. Additionally, Gene Ontology (GO) enrichment analysis indicated that non-core pangenome genes are associated with pathogen recognition and immune signaling. Furthermore, the Folini pansecretome, encompassing secreted proteins critical for fungal pathogenicity, primarily consists of three functional classes: effector proteins, CAZymes, and proteases. These three classes account for approximately 3.5% of the pangenome. Each functional class within the pansecretome was meticulously annotated and characterized with respect to pangenome category distribution, PFAM domain frequency, and strain virulence assessment. This analysis revealed that highly virulent isolates have specific types of PFAM domains that are exclusive to them. Upon examining the repertoire of SIX genes known for virulence in other formae speciales, it was found that all isolates had a similar gene content except for two, which lacked SIX genes entirely.

  • Article type: Journal Article
    Tea, one of the most widely consumed beverages globally, exhibits remarkable genomic diversity in its underlying flavour and health-related compounds. In this study, we present the construction and analysis of a tea pangenome comprising a total of 11 genomes, with a focus on three newly sequenced genomes: the purple-leaved assamica cultivar "Zijuan", the temperature-sensitive sinensis cultivar "Anjibaicha", and the wild accession "L618", whose assemblies exhibited excellent quality scores as they benefited from the latest sequencing technologies. Our analysis incorporates a detailed investigation of the transposon complement across the tea pangenome, revealing shared patterns of transposon distribution among the studied genomes and improved transposon resolution with long-read technologies, as shown by long terminal repeat (LTR) Assembly Index analysis. Furthermore, our study encompasses a gene-centric exploration of the pangenome, examining the genomic landscape of the catechin pathway and providing insights into copy number alterations and gene-centric variants, especially for anthocyanidin synthases. We constructed a gene-centric pangenome by structurally and functionally annotating all available genomes using an identical pipeline, which both increased gene completeness and allowed for a high functional annotation rate. This improved and consistently annotated gene set will allow for a better comparison between tea genomes. We used this improved pangenome to capture the core and dispensable gene repertoire, elucidating the functional diversity present within the tea species. This pangenome might serve as a valuable resource for understanding the fundamental genetic basis of traits such as flavour, stress tolerance, and disease resistance, with implications for tea breeding programmes.