bacterial genomics

  • Article type: Journal Article
    NaHCO3 responsiveness is a novel phenotype in which some methicillin-resistant Staphylococcus aureus (MRSA) isolates exhibit significantly lower minimum inhibitory concentrations (MICs) to oxacillin and/or cefazolin in the presence of NaHCO3. NaHCO3 responsiveness correlated with treatment response to β-lactams in an endocarditis animal model. We investigated whether treatment of NaHCO3-responsive strains with β-lactams was associated with faster clearance of bacteremia. The CAMERA2 trial (Combination Antibiotics for Methicillin-Resistant Staphylococcus aureus) randomly assigned participants with MRSA bloodstream infections to standard therapy, or to standard therapy plus an anti-staphylococcal β-lactam (combination therapy). For 117 CAMERA2 MRSA isolates, we determined by broth microdilution the MIC of cefazolin and oxacillin, with and without 44 mM of NaHCO3. Isolates exhibiting a ≥4-fold decrease in the MIC to cefazolin or oxacillin in the presence of NaHCO3 were considered "NaHCO3-responsive" to that agent. We compared the rate of persistent bacteremia among participants who had infections caused by NaHCO3-responsive and non-responsive strains, and who were assigned to combination treatment with a β-lactam. Thirty-one percent (36/117) and 25% (21/85) of MRSA isolates were NaHCO3-responsive to cefazolin and oxacillin, respectively. The NaHCO3-responsive phenotype was significantly associated with sequence type 93, SCCmec type IVa, and mecA alleles with substitutions in positions -7 and -38 in the regulatory region. Among participants treated with a β-lactam, there was no association between the NaHCO3-responsive phenotype and persistent bacteremia (cefazolin, P = 0.82; oxacillin, P = 0.81). In patients from a randomized clinical trial with MRSA bloodstream infection, isolates with an in vitro β-lactam-NaHCO3-responsive phenotype were associated with distinctive genetic signatures, but not with a shorter duration of bacteremia among those treated with a β-lactam.
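The responsiveness rule stated in the abstract (a ≥4-fold MIC decrease in the presence of NaHCO3) can be written as a one-line check. The following is a minimal sketch only; the function name and the example MIC values are hypothetical and are not taken from the study.

```python
def is_nahco3_responsive(mic_without, mic_with, min_fold=4):
    """Return True when the MIC falls at least `min_fold`-fold with NaHCO3.

    mic_without: MIC (e.g. in mg/L) measured without NaHCO3
    mic_with:    MIC measured with NaHCO3 in the medium
    """
    if mic_without <= 0 or mic_with <= 0:
        raise ValueError("MIC values must be positive")
    return mic_without / mic_with >= min_fold


# Hypothetical isolates: (MIC without NaHCO3, MIC with NaHCO3)
example_isolates = {
    "isolate_A": (64, 8),   # 8-fold decrease
    "isolate_B": (32, 16),  # 2-fold decrease
    "isolate_C": (16, 4),   # exactly 4-fold decrease
}
responsive = {
    name: is_nahco3_responsive(without, with_)
    for name, (without, with_) in example_isolates.items()
}
```

Because broth microdilution uses two-fold dilution steps, a 4-fold decrease corresponds to a shift of two dilution steps.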

  • Article type: Journal Article
    This computational protocol describes how to use pyPGCF, a Python software package that runs in the Linux environment, to analyze bacterial genomes and perform: (i) phylogenomic analysis, (ii) species demarcation, (iii) identification of the core proteins of a bacterial genus and its individual species, (iv) identification of species-specific fingerprint proteins that are found in all strains of a species and, at the same time, are absent from all other species of the genus, (v) functional annotation of the core and fingerprint proteins with eggNOG, and (vi) identification of secondary metabolite biosynthetic gene clusters (smBGCs) with antiSMASH. This software has already been used to analyze bacterial genera and species that are important for plants (e.g., Pseudomonas, Bacillus, Streptomyces). In addition, we provide a test dataset and example commands showing how to analyze 165 genomes from 55 species of the genus Bacillus. The main advantages of pyPGCF are that: (i) it uses adjustable orthology cut-offs, (ii) it identifies species-specific fingerprints, and (iii) its computational cost scales linearly with the number of genomes being analyzed. Therefore, pyPGCF is able to deal with a very large number of bacterial genomes, in reasonable timescales, using widely available levels of computing power.
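The definitions of core and fingerprint proteins given in the abstract reduce to set operations once every genome is represented as a set of orthologous-group identifiers. The sketch below illustrates only those definitions. It is not pyPGCF's interface or algorithm, and the function names, species labels, and group identifiers are all hypothetical.

```python
def core_proteins(genomes):
    """Orthologous groups present in every genome of a collection.

    genomes: dict mapping genome name -> set of orthologous-group IDs
    """
    genome_sets = list(genomes.values())
    return set.intersection(*genome_sets) if genome_sets else set()


def fingerprint_proteins(species, species_to_genomes):
    """Groups found in all strains of `species` and in no other species."""
    core = core_proteins(species_to_genomes[species])
    seen_elsewhere = set()
    for name, genomes in species_to_genomes.items():
        if name != species:
            for groups in genomes.values():
                seen_elsewhere |= groups
    return core - seen_elsewhere


# Hypothetical genus with two species and two strains each
species_to_genomes = {
    "species_A": {
        "A1": {"og1", "og2", "og3"},
        "A2": {"og1", "og2", "og3", "og4"},
    },
    "species_B": {
        "B1": {"og1", "og3", "og5"},
        "B2": {"og1", "og5"},
    },
}

# Genus core: groups shared by every genome regardless of species
all_genomes = {
    strain: groups
    for genomes in species_to_genomes.values()
    for strain, groups in genomes.items()
}
genus_core = core_proteins(all_genomes)
```

In this toy data set, `og2` is the only group that is present in both strains of species_A and absent from every strain of species_B, so it is the fingerprint of species_A.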

  • Article type: Journal Article
    Source attribution has traditionally involved combining epidemiological data with different pathogen characterisation methods, including 7-gene multi-locus sequence typing (MLST) or serotyping; however, these approaches have limited resolution. In contrast, whole genome sequencing data provide an overview of the whole genome that can be used by attribution algorithms. Here, we applied a random forest (RF) algorithm to predict the primary sources of human clinical Salmonella Typhimurium (S. Typhimurium) and monophasic variant (monophasic S. Typhimurium) isolates. To this end, we utilised single nucleotide polymorphism diversity in the core genome MLST alleles obtained from 1,061 laboratory-confirmed human and animal S. Typhimurium and monophasic S. Typhimurium isolates as inputs into an RF model. The algorithm was used for supervised learning to classify 399 animal S. Typhimurium and monophasic S. Typhimurium isolates into one of eight distinct primary source classes comprising common livestock and pet animal species: cattle, pigs, sheep, other mammals (pets: mostly dogs and horses), broilers, layers, turkeys, and game birds (pheasants, quail, and pigeons). When applied to the training set animal isolates, model accuracy was 0.929 and kappa 0.905, whereas for the test set animal isolates, for which the primary source class information was withheld from the model, the accuracy was 0.779 and kappa 0.700. Subsequently, the model was applied to assign 662 human clinical cases to the eight primary source classes. In the dataset, 60/399 (15.0%) of the animal and 141/662 (21.3%) of the human isolates were associated with a known outbreak of S. Typhimurium definitive type (DT) 104. All but two of the 141 DT104 outbreak-linked human isolates were correctly attributed by the model to the primary source classes identified as the origin of the DT104 outbreak. A model that was run without the clonal DT104 animal isolates produced largely congruent outputs (training set accuracy 0.989 and kappa 0.985; test set accuracy 0.781 and kappa 0.663). Overall, our results show that RF offers considerable promise as a suitable methodology for epidemiological tracking and source attribution for foodborne pathogens.
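The abstract reports both accuracy and kappa for the classifier. Assuming kappa here means Cohen's kappa, it corrects the observed agreement p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes both metrics from paired label lists. The function name and the labels are hypothetical, and the numbers are not from the study.

```python
from collections import Counter


def accuracy_and_kappa(true_labels, predicted_labels):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(true_labels)
    # Observed agreement: fraction of isolates assigned to the correct class
    observed = sum(t == p for t, p in zip(true_labels, predicted_labels)) / n
    # Chance agreement: product of the marginal class frequencies, summed
    true_counts = Counter(true_labels)
    predicted_counts = Counter(predicted_labels)
    expected = sum(
        true_counts[label] * predicted_counts[label] for label in true_counts
    ) / (n * n)
    if expected == 1:
        return observed, float("nan")  # kappa is undefined with one class
    return observed, (observed - expected) / (1 - expected)


# Hypothetical source classes for four animal isolates
true_sources = ["cattle", "cattle", "pigs", "pigs"]
predicted_sources = ["cattle", "pigs", "pigs", "pigs"]
accuracy, kappa = accuracy_and_kappa(true_sources, predicted_sources)
```

Here three of four predictions are correct (accuracy 0.75) while chance agreement is 0.5, giving a kappa of 0.5. This is why kappa is typically lower than accuracy when class sizes are uneven.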

  • Article type: Journal Article
    Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, although the predicted MIC must still be interpreted. Nevertheless, precisely how we should handle MIC data when dealing with predictive models remains unclear, since they are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely, Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. Then we assessed how model prediction accuracy was affected when MICs were framed as regression and classification. Our results showed that treating the MICs differently depending on the number of concentration levels of antibiotic available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification. Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that expanding the population database is pivotal for the future clinical implementation of these models to support routine machine-learning based diagnostics.
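To make the data-handling question concrete, the sketch below parses reported MIC strings into a log2 value plus a censoring flag, then picks a framing from the number of distinct levels observed. It is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, and the cut-off `min_levels_for_regression=5` is an arbitrary placeholder because the abstract gives no numeric threshold.

```python
import math


def parse_mic(text):
    """Parse an MIC string such as '<=0.25', '4' or '>32'.

    Returns (log2 of the numeric value, censoring), where censoring is
    'left', 'right' or 'none'.
    """
    text = text.strip()
    # Two-character prefixes are tested first so '<=' is not read as '<'
    for prefix, censoring in (
        ("<=", "left"), (">=", "right"), ("<", "left"), (">", "right")
    ):
        if text.startswith(prefix):
            return math.log2(float(text[len(prefix):])), censoring
    return math.log2(float(text)), "none"


def choose_framing(mic_strings, min_levels_for_regression=5):
    """Pick 'regression' or 'classification' from the number of MIC levels.

    The threshold is a placeholder assumption, not a published value.
    """
    levels = {parse_mic(mic) for mic in mic_strings}
    if len(levels) >= min_levels_for_regression:
        return "regression"
    return "classification"


# Hypothetical MIC panels for two antibiotics
wide_panel = ["<=0.25", "0.5", "1", "2", "4", ">4"]
narrow_panel = ["<=2", "4", ">4"]
```

The log2 scale is the natural one for two-fold dilution series, since adjacent concentrations differ by exactly one unit.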

  • Article type: Journal Article
    Functional genomics techniques, such as transposon insertion sequencing and RNA-sequencing, are key to studying relative differences in bacterial mutant fitness or gene expression under selective conditions. However, certain stress conditions, mutations, or antibiotics can directly interfere with DNA synthesis, resulting in systematic changes in local DNA copy numbers along the chromosome. This can lead to artifacts in sequencing-based functional genomics data when comparing antibiotic treatment to an unstressed control. Further, relative differences in gene-wise read counts may result from alterations in chromosomal replication dynamics, rather than selection or direct gene regulation. We term this artifact "chromosomal location bias" and implement a principled statistical approach to correct it by calculating local normalization factors along the chromosome. These normalization factors are then directly incorporated into statistical analyses using standard RNA-sequencing analysis methods without modifying the read counts themselves, preserving important information about the mean-variance relationship in the data. We illustrate the utility of this approach by generating and analyzing a ciprofloxacin-treated transposon insertion sequencing data set in Escherichia coli as a case study. We show that ciprofloxacin treatment generates chromosomal location bias in the resulting data, and we further demonstrate that failing to correct for this bias leads to false predictions of mutant drug sensitivity as measured by minimum inhibitory concentrations. We have developed an R package and user-friendly graphical Shiny application, ChromoCorrect, that detects and corrects for chromosomal bias in read count data, enabling the application of functional genomics technologies to the study of antibiotic stress. IMPORTANCE: Altered gene dosage due to changes in DNA replication has been observed under a variety of stresses with a variety of experimental techniques. However, the implications of changes in gene dosage for sequencing-based functional genomics assays are rarely considered. We present a statistically principled approach to correcting for the effect of changes in gene dosage, enabling testing for differences in the fitness effects or regulation of individual genes in the presence of confounding differences in DNA copy number. We show that failing to correct for these effects can lead to incorrect predictions of resistance phenotype when applying functional genomics assays to investigate antibiotic stress, and we provide a user-friendly application to detect and correct for changes in DNA copy number.
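One simple way to picture a "local normalization factor" is as a smoothed treated-to-control ratio computed over neighbouring genes on a circular chromosome. The sketch below uses a sliding-window median of log2 ratios. This is an assumed stand-in chosen for illustration and should not be read as the method implemented in ChromoCorrect; the function name, pseudocount, and window size are hypothetical.

```python
import math
from statistics import median


def local_normalization_factors(control, treated, window=3, pseudocount=1):
    """Estimate per-gene factors describing position-dependent count shifts.

    control, treated: read counts per gene, ordered by chromosomal position
    window: odd number of neighbouring genes in each sliding window
    The chromosome is treated as circular, so windows wrap around the ends.
    """
    n = len(control)
    if n == 0 or n != len(treated):
        raise ValueError("count lists must be non-empty and of equal length")
    if window % 2 == 0 or window > n:
        raise ValueError("window must be odd and no larger than the gene count")
    log_ratios = [
        math.log2((t + pseudocount) / (c + pseudocount))
        for c, t in zip(control, treated)
    ]
    half = window // 2
    factors = []
    for i in range(n):
        neighbours = [log_ratios[(i + k) % n] for k in range(-half, half + 1)]
        # The median resists single genes with genuine fitness effects
        factors.append(2 ** median(neighbours))
    return factors


# Hypothetical counts: the first three genes show doubled coverage
control_counts = [99, 99, 99, 99, 99, 99]
treated_counts = [199, 199, 199, 99, 99, 99]
factors = local_normalization_factors(control_counts, treated_counts)
```

Consistent with the abstract, such factors would be supplied to the downstream statistical model alongside the raw counts rather than being used to rescale the counts directly.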

  • Article type: Journal Article
    Fundamental to effective Legionnaires' disease outbreak control is the ability to rapidly identify the environmental source(s) of the causative agent, Legionella pneumophila. Genomics has revolutionized pathogen surveillance, but L. pneumophila has a complex ecology and population structure that can limit source inference based on standard core genome phylogenetics. Here, we present a powerful machine learning approach that assigns the geographical source of Legionnaires' disease outbreaks more accurately than current core genome comparisons. Models were developed using 534 L. pneumophila genome sequences, including 149 genomes linked to 20 previously reported Legionnaires' disease outbreaks through detailed case investigations. Our classification models were developed in a cross-validation framework using only environmental L. pneumophila genomes. Assignments of clinical isolate geographic origins demonstrated high predictive sensitivity and specificity of the models, with no false positives or false negatives for 13 out of 20 outbreak groups, despite the presence of within-outbreak polyclonal population structure. Analysis of the same 534-genome panel with a conventional phylogenomic tree and a core genome multi-locus sequence type allelic distance-based classification approach revealed that our machine learning method had the highest overall classification performance (agreement with epidemiological information). Our multivariate statistical learning approach maximizes the use of genomic variation data and is thus well-suited for supporting Legionnaires' disease outbreak investigations. IMPORTANCE: Identifying the sources of Legionnaires' disease outbreaks is crucial for effective control. Current genomic methods, while useful, often fall short due to the complex ecology and population structure of Legionella pneumophila, the causative agent. Our study introduces a high-performing machine learning approach for more accurate geographical source attribution of Legionnaires' disease outbreaks. Developed using cross-validation on environmental L. pneumophila genomes, our models demonstrate excellent predictive sensitivity and specificity. Importantly, this new approach outperforms traditional methods like phylogenomic trees and core genome multi-locus sequence typing, proving more efficient at leveraging genomic variation data to infer outbreak sources. Our machine learning algorithms, harnessing both core and accessory genomic variation, offer significant promise in public health settings. By enabling rapid and precise source identification in Legionnaires' disease outbreaks, such approaches have the potential to expedite intervention efforts and curtail disease transmission.

  • Article type: Journal Article
    Xanthomonas translucens, the causal agent of bacterial leaf streak disease (BLS) in cereals, is a re-emerging pathogen that is becoming increasingly destructive across the world. While BLS has caused yield losses in the past, there is anecdotal evidence that newer isolates may be more virulent. We observed that two X. translucens isolates collected from two sites in Colorado, USA, are more aggressive on current wheat and barley varieties compared to older isolates, and we hypothesize that genetic changes between recent and older isolates contribute to the differences in isolate aggressiveness. To test this, we phenotyped and genetically characterized two X. translucens isolates collected from Colorado in 2018, which we designated CO236 (from barley) and CO237 (from wheat). Using pathovar-specific phenotyping and PCR primers, we determined that CO236 belongs to pathovar translucens (Xtt) and CO237 belongs to pathovar undulosa (Xtu). We sequenced the full genomes of the isolates using Oxford Nanopore long-read sequencing, and compared their whole genomes against published X. translucens genomes. This analysis confirmed our pathovar designations for Xtt CO236 and Xtu CO237, and showed that, at the whole-genome level, there were no obvious genomic structural changes between Xtt CO236 and Xtu CO237 and other respective published pathovar genomes. Focusing on pathovar undulosa (Xtu CO237), we then compared putative type III effectors among all available Xtu isolate genomes and found that they were highly conserved. However, there were striking differences in the presence and sequence of various transcription activator-like effectors between Xtu CO237 and published undulosa genomes, which correlate with isolate virulence. Here, we explore the potential implications of the differences in these virulence factors, and provide possible explanations for the increased virulence of recently emerged isolates.

  • Article type: Journal Article
    Shiga toxin-producing Escherichia coli (STEC) O157:H7 strains with the T allele in the translocated intimin receptor polymorphism (tir) 255 A > T gene associate with human disease more than strains with an A allele; however, the allele is not thought to be the direct cause of this difference. We sequenced a diverse set of STEC O157:H7 strains (26% A allele, 74% T allele) to identify linked differences that might underlie disease association. The average chromosome and pO157 plasmid size and gene content were significantly greater within the tir 255 A allele strains. Eighteen coding sequences were unique to tir 255 A allele chromosomes, and three were unique to tir 255 T allele chromosomes. There also were non-pO157 plasmids that were unique to each tir 255 allele variant. The overall average number of prophages did not differ between tir 255 allele strains; however, there were different types between the strains. Genomic and mobile element variation linked to the tir 255 polymorphism may account for the increased frequency of the T allele isolates in human disease.

  • Article type: Journal Article
    Hospital bloodstream infection (BSI) caused by methicillin-resistant Staphylococcus aureus (MRSA) is a major cause of morbidity and mortality and is frequently related to invasive procedures and medically complex patients. An important feature of MRSA is the clonal structure of its population. Specific MRSA clones may differ in their pathogenic, epidemiological, and antimicrobial resistance profiles. Whole-genome sequencing is currently the most robust and discriminatory technique for tracking hypervirulent/well-adapted MRSA clones. However, it remains an expensive and time-consuming technique that requires specialized personnel. In this work, we describe a pangenome protocol, based on a binary matrix (1,0) of open reading frames (ORFs), that can be used to quickly find diagnostic, apomorphic sequence mutations that can serve as biomarkers. We use this technique to create a diagnostic screen for MRSA isolates circulating in the Rio de Janeiro metropolitan area, the RdJ clone, which is prevalent in BSI. The method described here has 100% specificity and sensitivity, eliminating the need to use genomic sequencing for clonal identification. The protocol used is relatively simple, and all the steps, formulas, and commands used are described in this work, such that this strategy can also be used to identify other MRSA clones and even clones from other bacterial species.

  • Article type: Review
    Background: Advancements in DNA sequencing technology have transformed the field of bacterial genomics, allowing for faster and more cost-effective chromosome-level assemblies compared to a decade ago. However, transforming raw reads into a complete genome model is a significant computational challenge due to the varying quality and quantity of data obtained from different sequencing instruments, as well as intrinsic characteristics of the genome and desired analyses. To address this issue, we have developed a set of container-based pipelines using Nextflow, offering both common workflows for inexperienced users and high levels of customization for experienced ones. Their processing strategies are adaptable based on the sequencing data type, and their modularity enables the incorporation of new components to address the community's evolving needs. Methods: These pipelines consist of three parts: quality control, de novo genome assembly, and bacterial genome annotation. In particular, the genome annotation pipeline provides a comprehensive overview of the genome, including standard gene prediction and functional inference, as well as predictions relevant to clinical applications such as virulence and resistance gene annotation, secondary metabolite detection, prophage and plasmid prediction, and more. Results: The annotation results are presented in reports, genome browsers, and a web-based application that enables users to explore and interact with the genome annotation results. Conclusions: Overall, our user-friendly pipelines offer a seamless integration of computational tools to facilitate routine bacterial genomics research. The effectiveness of these pipelines is illustrated by examining the sequencing data of a clinical sample of Klebsiella pneumoniae.