Keywords: 16S rRNA; Amplicon data analysis; Amplicon sequencing; ITS; Long read; Microbiome; Short read

MeSH: RNA, Ribosomal, 16S; Software; Reproducibility of Results; Sequence Analysis; Soil; High-Throughput Nucleotide Sequencing / methods; Sequence Analysis, DNA / methods

Source: DOI:10.1186/s40168-022-01365-1

Abstract:
Amplicon sequencing is an established and cost-efficient method for profiling microbiomes. However, many of the available tools for processing these data require both bioinformatics skills and high computational power to handle big datasets. Furthermore, only a few tools support the analysis of long-read amplicon data. To bridge this gap, we developed the LotuS2 (less OTU scripts 2) pipeline, which enables user-friendly, resource-friendly, and versatile analysis of raw amplicon sequences.
In LotuS2, six different sequence clustering algorithms and extensive pre- and post-processing options allow flexible data analysis both by experts, who can fully adjust parameters, and by novices, for whom defaults are provided for different scenarios. We benchmarked three independent gut and soil datasets, on which LotuS2 was on average 29 times faster than other pipelines, yet better reproduced the alpha- and beta-diversity of technical replicate samples. Further benchmarking on a mock community with known taxon composition showed that, compared to the other pipelines, LotuS2 recovered a higher fraction of correctly identified taxa and a higher fraction of reads assigned to true taxa (48% and 57% at species level; 83% and 98% at genus level, respectively). At the ASV/OTU level, precision and F-score were highest for LotuS2, as was the fraction of correctly reported 16S sequences.
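The mock-community metrics named above can be illustrated with a minimal sketch. It assumes the common set-based definitions of precision, recall, and F-score, plus a read-weighted fraction assigned to true taxa; the paper's exact definitions may differ. The function name `mock_community_scores` and the taxon labels are hypothetical and are not part of LotuS2.

```python
def mock_community_scores(reported, truth):
    """Score a pipeline's taxon calls against a mock community.

    reported: dict mapping taxon name -> number of reads assigned to it
    truth:    set of taxon names known to be in the mock community
    (Both inputs are illustrative assumptions, not LotuS2 output formats.)
    """
    reported_taxa = set(reported)
    true_positives = reported_taxa & truth  # reported and really present

    # Precision: share of reported taxa that are truly in the mock community.
    precision = len(true_positives) / len(reported_taxa) if reported_taxa else 0.0
    # Recall: share of the mock community's taxa that were recovered.
    recall = len(true_positives) / len(truth) if truth else 0.0
    # F-score: harmonic mean of precision and recall.
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0

    # Read-weighted view: share of all reads that landed on true taxa.
    total_reads = sum(reported.values())
    true_reads = sum(n for taxon, n in reported.items() if taxon in truth)
    read_fraction = true_reads / total_reads if total_reads else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "read_fraction_true": read_fraction,
    }


# Hypothetical example: four taxa in the mock community, three reported.
scores = mock_community_scores(
    reported={"A": 50, "B": 30, "X": 20},
    truth={"A", "B", "C", "D"},
)
print(scores)
```

In this toy case two of the three reported taxa are real, so taxon-level precision is 2/3, while 80 of 100 reads fall on true taxa. This shows why the fraction of correct taxa and the fraction of correctly assigned reads are reported as separate numbers.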
LotuS2 is a lightweight and user-friendly pipeline that is fast, precise, and streamlined, using extensive pre- and post-ASV/OTU clustering steps to further increase data quality. High data usage rates and reliability enable high-throughput microbiome analysis in minutes.
LotuS2 is available from GitHub, conda, or via a Galaxy web interface, and is documented at http://lotus2.earlham.ac.uk/.