sequencing depth

  • Article type: Journal Article
    Oxford Nanopore long reads of simulated bacterial communities from fresh spinach and surface water were generated (R9.4.1+SQK-LSK109 and R10.4+SQK-LSK112; 0.5, 1, and 2 million reads). Salmonella enterica serotype Heidelberg, Montevideo, or Typhimurium was included alone or in combination in the spinach community, while the water community harbored Pseudomonas aeruginosa.

  • Article type: Journal Article
    As the amount of metagenomic sequencing continues to increase, there is a growing need for tools that help biologists make sense of the data. Specifically, researchers are often interested in the potential of a microbial community to carry out a metabolic reaction, but this analysis requires knitting together multiple software tools into a complex pipeline. Thanos offers a user-friendly R package designed for the pathway-centric analysis and visualization of the functions encoded within metagenomic samples. It allows researchers to go beyond taxonomic profiles and find out, quantitatively, which pathways are prevalent in an environment, and to compare different environments in terms of their functional potential. The analysis is based on the sequencing depth of the genes of interest, either in the metagenome-assembled genomes (MAGs) or in the assembled reads (contigs), using a normalization strategy that enables comparison across samples. The package can import data from multiple formats and offers functions for visualizing the results as bar plots of the functional profile, box plots to compare functions across samples, and annotated pathway graphs. By streamlining the analysis of the functional potential encoded in microbial communities, Thanos can enable impactful discoveries in all the fields touched by metagenomics, from human health to the environmental sciences.

  • Article type: Journal Article
    Paired-end short reads of Illumina HiSeq, MiSeq, and NovaSeq of simulated bacterial communities from fresh spinach and surface water were generated in silico at various sequencing depths. Multidrug-resistant Salmonella enterica serotype Indiana was included in the spinach community, while the water community contained multidrug-resistant Pseudomonas aeruginosa.

  • Article type: Journal Article
    Shotgun metagenomics sequencing experiments are finding a wide range of applications. Nonetheless, there are still limited guidelines regarding the number of sequences needed to acquire meaningful information for taxonomic profiling and antimicrobial resistance gene (ARG) identification. In this study, we explored this issue in the context of the oral microbiota by sequencing four human plaque samples and one microbial community standard at a very high number of sequences (~100 million) and by evaluating the performance of microbial identification and ARG detection through a downsampling procedure. When investigating the impact of a decreasing number of sequences on quantitative taxonomic profiling in the microbial community standard datasets, we found some discrepancies in the identified microbial species and their abundances when compared to the expected ones. Such differences were consistent throughout downsampling, suggesting that they stem from limitations of the taxonomic profiling methods. Overall, results showed that the number of sequences has a great impact on metagenomic samples at the qualitative (i.e., presence/absence) level in terms of loss of information, especially in experiments with fewer than 40 million reads, whereas abundance estimation was minimally affected, with only slight variations observed in low-abundance species. The presence of ARGs was also assessed: a total of 133 ARGs were identified. Notably, 23% of them were inconsistently detected as present or absent across downsampled datasets of the same sample. Moreover, over half of the ARGs were lost in datasets with fewer than 20 million reads. This study highlights the importance of carefully considering sequencing aspects and suggests some guidelines for designing shotgun metagenomics experiments with the final goal of maximizing oral microbiome analyses. Our findings suggest varying optimized sequence numbers according to different study aims: 40 million for microbiota profiling, 50 million for low-abundance species detection, and 20 million for ARG identification.
    KEY POINTS: • Forty million sequences are a cost-efficient solution for microbiota profiling • Fifty million sequences allow low-abundance species detection • Twenty million sequences are recommended for ARG identification.
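
The downsampling idea in this abstract can be illustrated with a minimal Python sketch. It is not the study's pipeline: the taxon labels, the community composition, and the `min_reads` detection threshold are all hypothetical, and each read is reduced to the taxon label it would be assigned to. The sketch only shows why rare members can fall below a detection threshold as fewer reads are drawn.

```python
import random
from collections import Counter


def downsample(reads, n, seed=0):
    """Draw n reads without replacement (reproducible via the seed)."""
    rng = random.Random(seed)
    return rng.sample(reads, n)


def detected_taxa(reads, min_reads=1):
    """Return the set of taxa supported by at least min_reads reads."""
    counts = Counter(reads)
    return {taxon for taxon, c in counts.items() if c >= min_reads}


# Hypothetical toy community: one label per read, 10,000 reads in total.
community = ["abundant_sp"] * 9_000 + ["moderate_sp"] * 990 + ["rare_sp"] * 10

full = detected_taxa(community, min_reads=5)
for depth in (10_000, 1_000, 100):
    subset = downsample(community, depth, seed=42)
    found = detected_taxa(subset, min_reads=5)
    # Report which taxa are no longer detected at this depth.
    print(depth, sorted(found), "lost:", sorted(full - found))
```

Presence/absence calls depend on both the depth and the chosen threshold, whereas the relative abundance of the dominant taxa changes little between subsets, which is consistent with the qualitative pattern the abstract reports.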

  • Article type: Journal Article
    Oxford Nanopore sequencing is one of the high-throughput sequencing technologies that facilitates the reconstruction of metagenome-assembled genomes (MAGs). This study aimed to assess the potential of long-read assembly algorithms in Oxford Nanopore sequencing to enhance the MAG-based identification of bacterial pathogens using both simulated and mock communities. Simulated communities were generated to mimic those on fresh spinach and in surface water. Long reads were produced using R9.4.1+SQK-LSK109 and R10.4+SQK-LSK112, with 0.5, 1, and 2 million reads. The simulated bacterial communities included multidrug-resistant Salmonella enterica serotypes Heidelberg, Montevideo, and Typhimurium in the fresh spinach community individually or in combination, as well as multidrug-resistant Pseudomonas aeruginosa in the surface water community. Real data sets of the ZymoBIOMICS HMW DNA Standard were also studied. A bioinformatic pipeline (MAGenie, freely available at https://github.com/jackchen129/MAGenie) that combines metagenome assembly, taxonomic classification, and sequence extraction was developed to reconstruct draft MAGs from metagenome assemblies. Five assemblers were evaluated based on a series of genomic analyses. Overall, Flye outperformed the other assemblers, followed by Shasta, Raven, and Unicycler, while Canu performed least effectively. In some instances, the extracted sequences resulted in draft MAGs and provided the locations and structures of antimicrobial resistance genes and mobile genetic elements. Our study showcases the viability of utilizing the extracted sequences for precise phylogenetic inference, as demonstrated by the consistent alignment of phylogenetic topology between the reference genome and the extracted sequences. R9.4.1+SQK-LSK109 was more effective in most cases than R10.4+SQK-LSK112, and greater sequencing depths generally led to more accurate results.
    IMPORTANCE: By examining diverse bacterial communities, particularly those housing multiple Salmonella enterica serotypes, this study holds significance in uncovering the potential of long-read assembly algorithms to improve metagenome-assembled genome (MAG)-based pathogen identification through Oxford Nanopore sequencing. Our research demonstrates that long-read assembly stands out as a promising avenue for boosting precision in MAG-based pathogen identification, thus advancing the development of more robust surveillance measures. The findings also support ongoing endeavors to fine-tune a bioinformatic pipeline for accurate pathogen identification within complex metagenomic samples.

  • Article type: Journal Article
    OBJECTIVE: To evaluate non-invasive prenatal testing (NIPT) and expanded non-invasive prenatal testing (NIPT-plus) for detecting aneuploidies at different sequencing depths and to assess Z-score accuracy in predicting trisomies 21, 18, and 13, as well as 45X and 47XXX.
    METHODS: This retrospective study included pregnancies with positive NIPT or NIPT-plus results detected at the prenatal diagnosis center of Nanfang Hospital between January 2017 and December 2022. Invasive prenatal diagnostic results were collected. Logistic regression analyses were used to study the relationship between Z-score and positive predictive value (PPV). Optimal cut-off values were obtained based on receiver operating characteristic analysis, and PPVs were calculated in different groups.
    RESULTS: We evaluated 1348 pregnant women with positive results, including 930 reported by NIPT and 418 reported by NIPT-plus. NIPT reported significantly more rare chromosomal aneuploidies (RCAs), and NIPT-plus had a significantly higher PPV for trisomy 21 (T21). Logistic regression analyses showed a significant association (P < 0.001) between Z-score and PPVs for T21 and trisomy 18 (T18). A linear relationship was observed between fetal fraction (FF) and Z-values in the true positive cases of T21 and T18. The high Z-score group had significantly higher PPVs than the low Z-score group for T21, T18, trisomy 13, and 47XXX, but not for 45X.
    CONCLUSIONS: The Z-score is helpful in assessing NIPT or NIPT-plus results. Therefore, we suggest including the Z-score and FF in the results. By combining the Z-score, FF, and maternal age, clinicians can interpret NIPT results more accurately and improve individual counseling to reduce patients' anxiety.
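
For readers unfamiliar with the statistic, a Z-score of this kind is generally computed by comparing a sample's chromosome read fraction with the mean and standard deviation of a euploid reference set. The Python sketch below shows that generic calculation only. It is not the laboratory's algorithm, and the function name and the reference values are hypothetical. A cutoff of 3 is a commonly quoted convention, whereas the study above derives its own cutoffs from receiver operating characteristic analysis.

```python
from statistics import mean, stdev


def chromosome_z_score(sample_fraction, reference_fractions):
    """Z = (sample fraction - reference mean) / reference standard deviation."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference fractions must not all be identical")
    return (sample_fraction - mu) / sigma


# Hypothetical fractions of reads mapping to chromosome 21 in euploid controls.
reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0130]
z = chromosome_z_score(0.0135, reference)
print(f"Z-score: {z:.2f}")
```

Because the excess of reads scales with the fetal fraction, a higher fetal fraction shifts the sample fraction further from the reference mean, which fits the linear FF to Z relationship reported for true positive T21 and T18 cases.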

  • Article type: Journal Article
    BACKGROUND: Parameters adversely affecting the contiguity and accuracy of assemblies from Illumina next-generation sequencing (NGS) are well described. However, past studies generally focused on their additive effects, overlooking potential interactions that may exacerbate one another's effects in a multiplicative manner. To investigate whether they act interactively on de novo genome assembly quality, we simulated sequencing data for 13 bacterial reference genomes with varying levels of error rate, sequencing depth, and PCR and optical duplicate ratios.
    RESULTS: We assessed the quality of assemblies from the simulated sequencing data with a number of contiguity and accuracy metrics, which we used to quantify both additive and multiplicative effects of the four parameters. We found that the tested parameters are engaged in complex interactions, exerting multiplicative, rather than additive, effects on assembly quality. Also, the ratio of non-repeated regions and GC% of the original genomes can shape how the four parameters affect assembly quality.
    CONCLUSIONS: We provide a framework for consideration in future studies using de novo genome assembly of bacterial genomes, e.g. in choosing the optimal sequencing depth, balancing between its positive effect on contiguity and negative effect on accuracy due to its interaction with error rate. Furthermore, the properties of the genomes to be sequenced also should be taken into account, as they might influence the effects of error sources themselves.
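
When choosing a sequencing depth, the usual starting point is the expected-coverage relation C = N × L / G, where N is the number of reads, L the read length, and G the genome size. The Python sketch below applies that relation. The 5 Mb genome, 150 bp reads, and 30× target are hypothetical illustration values, not figures from the study, which does not state an optimal depth in this abstract.

```python
import math


def expected_depth(num_reads, read_length, genome_size):
    """Mean sequencing depth C = N * L / G."""
    return num_reads * read_length / genome_size


def reads_for_depth(target_depth, read_length, genome_size):
    """Smallest number of reads whose expected depth reaches target_depth."""
    return math.ceil(target_depth * genome_size / read_length)


# Hypothetical bacterial genome of 5 Mb sequenced with 150 bp reads.
genome_size = 5_000_000
read_length = 150

print(expected_depth(1_000_000, read_length, genome_size))
print(reads_for_depth(30, read_length, genome_size))
```

For paired-end data, each read pair contributes two reads, so the number of pairs required is half the value returned by `reads_for_depth`.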

  • Article type: Journal Article
    Limiting harm to organisms caused by genetic sampling is an important consideration for rare species, and a number of non-destructive sampling techniques have been developed to address this issue in freshwater mussels. Two methods, visceral swabbing and tissue biopsies, have proven to be effective for DNA sampling, though it is unclear which method is preferable for genotyping-by-sequencing (GBS). Tissue biopsies may cause undue stress and damage to organisms, while visceral swabbing potentially reduces the chance of such harm. Our study compared the efficacy of these two DNA sampling methods for generating GBS data for a unionid freshwater mussel, the Texas pigtoe (Fusconaia askewi). Our results show that both methods generate quality sequence data, though some considerations are in order. Tissue biopsies produced significantly higher DNA concentrations and larger numbers of reads when compared with swabs, though there was no significant association between starting DNA concentration and number of reads generated. Swabbing produced greater sequence depth (more reads per sequence), while tissue biopsies revealed greater coverage across the genome (at lower sequence depth). Patterns of genomic variation as characterized in principal component analyses were similar regardless of the sampling method, suggesting that the less invasive swabbing is a viable option for producing quality GBS data in these organisms.
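
The distinction drawn here between sequence depth (reads per sequenced locus) and coverage across the genome (how many loci are sequenced at all) can be made concrete with a short Python sketch. The per-locus read counts below are invented toy numbers chosen only to illustrate the two definitions. They are not the study's data, and the function name is hypothetical.

```python
def depth_and_breadth(reads_per_locus, min_reads=1):
    """Return (mean depth over covered loci, fraction of loci covered)."""
    if not reads_per_locus:
        raise ValueError("reads_per_locus must not be empty")
    covered = [n for n in reads_per_locus if n >= min_reads]
    breadth = len(covered) / len(reads_per_locus)
    mean_depth = sum(covered) / len(covered) if covered else 0.0
    return mean_depth, breadth


# Toy profiles with the same total read count spread differently over 4 loci.
deep_but_narrow = [20, 20, 0, 0]      # fewer loci, more reads per locus
shallow_but_broad = [10, 10, 10, 10]  # every locus, fewer reads per locus

print(depth_and_breadth(deep_but_narrow))
print(depth_and_breadth(shallow_but_broad))
```

The same number of reads can therefore yield either high depth at few loci or lower depth at many loci, which is the trade-off the abstract describes between swabs and biopsies.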

  • Article type: Journal Article
    Next-generation sequencing (NGS) has raised a growing interest in phage display research. Sequencing depth is a pivotal parameter for using NGS. In the current study, we made a side-by-side comparison of two NGS platforms with different sequencing depths, denoted as lower-throughput (LTP) and higher-throughput (HTP). The capacity of these platforms for characterization of the composition, quality, and diversity of the unselected Ph.D.™-12 Phage Display Peptide Library was investigated. Our results indicated that HTP sequencing detects a considerably higher number of unique sequences compared to the LTP platform, thus covering a broader diversity of the library. We found a larger percentage of singletons, a smaller percentage of repeated sequences, and a greater percentage of distinct sequences in the LTP datasets. These parameters suggest a higher library quality, resulting in potentially misleading information when using LTP sequencing for such assessment. Our observations showed that HTP reveals a broader distribution of peptide frequencies, thus revealing increased heterogeneity of the library by the HTP approach and offering a comparatively higher capacity for distinguishing peptides from each other. Our analyses suggested that LTP and HTP datasets show discrepancies in their peptide composition and position-specific distribution of amino acids within the library. Taken together, these findings lead us to the conclusion that a higher sequencing depth can yield more in-depth insights into the composition of the library and provide a more complete picture of the quality and diversity of phage display peptide libraries.
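
The library metrics mentioned in this abstract (unique sequences, singletons, distinct sequences) can be computed from a list of sequenced peptides as in the Python sketch below. The exact definitions used by the authors are not given here, so the ones in the code are assumptions: "distinct" is taken as unique sequences divided by total reads, and "singletons" as sequences observed exactly once. The peptide strings are hypothetical.

```python
from collections import Counter


def library_stats(peptides):
    """Summarize diversity of a sequenced peptide library (assumed definitions)."""
    if not peptides:
        raise ValueError("peptides must not be empty")
    counts = Counter(peptides)
    total = len(peptides)
    unique = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "total_reads": total,
        "unique_sequences": unique,
        "singletons": singletons,
        # Share of reads that are distinct sequences (assumed definition).
        "pct_distinct": 100 * unique / total,
        # Share of unique sequences seen exactly once (assumed definition).
        "pct_singletons": 100 * singletons / unique,
    }


# Hypothetical 12-mer peptides; the first one was read twice.
reads = ["ACDEFGHIKLMN", "ACDEFGHIKLMN", "NMLKIHGFEDCA", "QRSTVWYACDEF"]
print(library_stats(reads))
```

With shallow sequencing most sequences are seen only once, so the percentage of singletons and distinct sequences is inflated, which is why the abstract warns that lower-throughput data can overstate library quality.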

  • Article type: Randomized Controlled Trial
    Over the last decade, the field of systems vaccinology has emerged, in which high throughput transcriptomics and other omics assays are used to probe changes of the innate and adaptive immune system in response to vaccination. The goal of this study was to benchmark key technical and analytical parameters of RNA sequencing (RNA-seq) in the context of a multi-site, double-blind randomized vaccine clinical trial.
    We collected longitudinal peripheral blood mononuclear cell (PBMC) samples from 10 subjects before and after vaccination with a live attenuated Francisella tularensis vaccine and performed RNA-Seq at two different sites using aliquots from the same sample to generate two replicate datasets (5 time points for 50 samples each). We evaluated the impact of (i) filtering lowly-expressed genes, (ii) using external RNA controls, (iii) fold change and false discovery rate (FDR) filtering, (iv) read length, and (v) sequencing depth on the concordance of differentially expressed genes (DEGs) between replicate datasets. Using synthetic mRNA spike-ins, we developed a method for empirically establishing minimal read-count thresholds for maintaining fold change accuracy on a per-experiment basis. We defined a reference PBMC transcriptome by pooling sequence data and established the impact of sequencing depth and gene filtering on transcriptome representation. Lastly, we modeled statistical power to detect DEGs for a range of sample sizes, effect sizes, and sequencing depths.
    Our results showed that (i) filtering lowly-expressed genes is recommended to improve fold-change accuracy and inter-site agreement, if possible guided by mRNA spike-ins; (ii) read length did not have a major impact on DEG detection; (iii) applying fold-change cutoffs for DEG detection reduced inter-set agreement and should be used with caution, if at all; (iv) reduction in sequencing depth had a minimal impact on statistical power but reduced the identifiable fraction of the PBMC transcriptome; and (v) after sample size, effect size (i.e., the magnitude of fold change) was the most important driver of statistical power to detect DEGs. The results from this study provide RNA sequencing benchmarks and guidelines for planning future similar vaccine studies.