gene repertoires

  • Article type: Journal Article
    The presence of intermittently dispersed insertion sequences and transposases in the Mycobacterium tuberculosis (Mtb) genome makes intra-genome recombination events inevitable. Understanding their effect on the gene repertoires (GR), which may contribute to the development of drug-resistant Mtb, is critical. In this study, publicly available WGS data of clinical Mtb isolates (endemic region n = 2,601; non-endemic region n = 1,130) were de novo assembled, filtered, scaffolded into assemblies, and functionally annotated. Of the 2,601 Mtb WGS data sets from endemic regions, 2,184 (drug resistant/sensitive: 1,386/798) qualified as high quality. We identified 3,784 core genes, 123 softcore genes, 224 shell genes, and 762 cloud genes in the pangenome of Mtb clinical isolates from endemic regions. Sets of 33 and 39 genes showed positive and negative associations (P < 0.01) with drug resistance status, respectively. Gene ontology clustering showed compromised immunity to phages and impaired DNA repair in drug-resistant Mtb clinical isolates compared to the sensitive ones. Multidrug efflux pump repressor genes (Rv3830c and Rv3855c) and CRISPR genes (Rv2816c-19c) were absent in the drug-resistant Mtb. A separate WGS data analysis of drug-resistant Mtb clinical isolates from the Netherlands (n = 1,130) also showed the absence of CRISPR genes (Rv2816c-17c). This study highlights the role of CRISPR genes in drug resistance development in Mtb clinical isolates, helps in understanding its evolutionary trajectory, and points to useful targets for diagnostics development.
    IMPORTANCE: The results from the present Pan-GWAS study comparing gene sets in drug-resistant and drug-sensitive Mtb clinical isolates revealed intricate presence-absence patterns of genes encoding DNA-binding proteins with gene regulatory, DNA modification, and DNA repair roles. Apart from the genes with known functions, some uncharacterized and hypothetical genes that seem to have a potential role in drug resistance development in Mtb were identified. We were able to relate many findings of the present study to the existing literature on the molecular aspects of drug-resistant Mtb, further strengthening the relevance of the results presented in this study.
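
The core/softcore/shell/cloud split reported in the abstract above can be illustrated with a minimal sketch. The abstract does not state the frequency cutoffs used, so the thresholds below (99%, 95%, 15%) are an assumption borrowed from the convention common in pangenome tools; the gene names, isolate IDs, and the functions `classify_gene` and `summarize_pangenome` are hypothetical and are not taken from the study.

```python
from collections import Counter


def classify_gene(n_present: int, n_isolates: int) -> str:
    """Assign a pangenome category from the fraction of isolates carrying a gene.

    Thresholds are assumed (conventional values), not taken from the abstract.
    """
    frac = n_present / n_isolates
    if frac >= 0.99:
        return "core"
    if frac >= 0.95:
        return "softcore"
    if frac >= 0.15:
        return "shell"
    return "cloud"


def summarize_pangenome(presence: dict[str, set[str]], n_isolates: int) -> Counter:
    """Count genes per category; `presence` maps gene -> set of isolate IDs carrying it."""
    return Counter(
        classify_gene(len(carriers), n_isolates) for carriers in presence.values()
    )


# Toy data: hypothetical gene names and 100 hypothetical isolates.
isolates = [f"iso{i}" for i in range(100)]
presence = {
    "geneA": set(isolates),       # present in 100/100 isolates
    "geneB": set(isolates[:96]),  # present in 96/100 isolates
    "geneC": set(isolates[:40]),  # present in 40/100 isolates
    "geneD": set(isolates[:3]),   # present in 3/100 isolates
}
counts = summarize_pangenome(presence, len(isolates))
print(dict(counts))
```

With a real presence/absence matrix, the same counting would yield the four category totals that the abstract reports, provided the cutoffs match those the authors actually used.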

  • Article type: Journal Article
    Arbuscular mycorrhizal (AM) fungi are ubiquitous endosymbionts of terrestrial plants. They help plants extract more nutrients from the soil and enhance plant tolerance to various ecological stress factors. The AM fungal genome sequence helps to identify the gene repertoires that are crucial for adaptation to different habitats and the mechanisms of interaction with the host plant. The present work comprises the first draft of the genome sequence of Rhizophagus proliferus, an important AM species present in biofertilizer consortia for agricultural purposes. The estimated genome size of R. proliferus is ~110 Mbp and its genomic assembly is 94.35% complete. Genome mining was carried out to identify putative gene families important for biological functions. A total of 22,526 protein-coding genes were estimated in the genome, with an abundance of kinases and a reduced number of glycoside hydrolases compared to other fungal classes. A striking finding in the R. proliferus genome was a higher number of carbohydrate esterases (CE), which may suggest higher saprotrophic activity in this species than in previously reported AM fungi and may indicate a role as a link between plants and soil mineral nutrients. The genome sequence and annotation of R. proliferus presented here would serve as an important reference for the functional genomics studies required for developing biofertilizer formulations in the future. In addition, the findings from this work may also prove important in deciphering the molecular mechanisms in AM fungi that govern host-specific interaction and the associated agricultural benefits.

  • Article type: Journal Article
    Knowledge of population-level processes is essential to understanding the efficacy of selection operating within a species. However, estimating effective population sizes (Ne) is particularly challenging in bacteria due to their extremely large census population sizes, varying rates of recombination, and arbitrary species boundaries.
    In this study, we estimated Ne for 153 species (152 bacteria and one archaeon) defined under a common framework and found that ecological lifestyle and growth rate were major predictors of Ne, and that, contrary to theoretical expectations, Ne was unaffected by recombination rate. Additionally, we found that Ne shapes the evolution and diversity of total gene repertoires of prokaryotic species.
    Together, these results point to a new model of genome architecture evolution in prokaryotes, in which pan-genome sizes, not individual genome sizes, are governed by drift-barrier evolution.

  • Article type: Comparative Study
    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy-to-use standard tool.
    We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow downstream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on GitHub ( https://github.com/ZFMK/COGNATE ).
    The tool COGNATE allows comparing genome assemblies and structural elements on multiple levels (e.g., scaffold or contig sequence, gene). It clearly enhances comparability between analyses. Thus, COGNATE can provide the important standardization of both genome and gene structure parameter disclosure as well as data acquisition for future comparative analyses. With the establishment of comprehensive descriptive standards and the extensive availability of genomes, an encompassing database will become possible.
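
To make concrete the kind of summary statistics a tool could derive from an assembly and its gene annotation, here is a small sketch. It is not COGNATE's code (COGNATE is written in Perl) and does not reproduce its parameter definitions; the `Gene` record, the `summarize` function, the chosen statistics, and the toy coordinates are all hypothetical and serve only to illustrate the idea of computing comparable descriptive parameters under fixed definitions.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record; coordinates are 1-based and inclusive."""
    scaffold: str
    start: int
    end: int
    exon_lengths: tuple[int, ...]  # assumed here to be coding exon lengths

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def summarize(genes: list[Gene], assembly_length: int) -> dict[str, float]:
    """Compute a few descriptive gene/genome structure statistics."""
    lengths = [g.length for g in genes]
    coding_total = sum(sum(g.exon_lengths) for g in genes)
    return {
        "gene_count": len(genes),
        "mean_gene_length": mean(lengths),
        "median_gene_length": median(lengths),
        "mean_exons_per_gene": mean(len(g.exon_lengths) for g in genes),
        "coding_fraction": coding_total / assembly_length,
        "genes_per_kb": len(genes) / (assembly_length / 1000),
    }


# Toy annotation on a hypothetical 10 kb assembly.
genes = [
    Gene("s1", 101, 400, (100, 50)),
    Gene("s1", 1001, 1600, (200, 100, 100)),
    Gene("s2", 1, 900, (900,)),
]
stats = summarize(genes, assembly_length=10_000)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Fixing the definitions in one place (for example, what counts toward gene length or coding sequence) is what makes the resulting numbers comparable across assemblies, which is the standardization point the abstract argues for.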