Keywords: Antimicrobial resistance; Experimental design; Sequencing depth; Shotgun metagenomics

MeSH: Humans; Metagenomics / methods; Dental Plaque / microbiology; Microbiota / genetics; Bacteria / genetics, classification, isolation & purification; Drug Resistance, Bacterial / genetics; Sequence Analysis, DNA / methods; Metagenome

Source: DOI:10.1007/s00253-024-13152-z   PDF (PubMed)

Abstract:
Shotgun metagenomic sequencing is finding a wide range of applications. Nonetheless, guidelines remain limited on the number of sequences needed to obtain meaningful information for taxonomic profiling and antimicrobial resistance gene (ARG) identification. In this study, we explored this issue in the context of the oral microbiota by sequencing four human plaque samples and one microbial community standard at a very high depth (~100 million sequences) and by evaluating the performance of microbial identification and ARG detection through a downsampling procedure. When investigating the impact of a decreasing number of sequences on quantitative taxonomic profiling in the microbial community standard datasets, we found some discrepancies between the identified microbial species and their abundances and the expected ones. These differences were consistent throughout downsampling, suggesting that they stem from limitations of the taxonomic profiling methods rather than from sequencing depth. Overall, the results showed that the number of sequences has a great impact on metagenomic samples at the qualitative (i.e., presence/absence) level in terms of loss of information, especially in experiments with fewer than 40 million reads, whereas abundance estimation was minimally affected, with only slight variations observed in low-abundance species. The presence of ARGs was also assessed: a total of 133 ARGs were identified. Notably, 23% of them were inconsistently detected as present or absent across downsampled datasets of the same sample. Moreover, over half of the ARGs were lost in datasets with fewer than 20 million reads. This study highlights the importance of carefully considering sequencing aspects and suggests some guidelines for designing shotgun metagenomics experiments with the final goal of maximizing oral microbiome analyses.
Our findings suggest different optimal sequence numbers for different study aims: 40 million for microbiota profiling, 50 million for low-abundance species detection, and 20 million for ARG identification.
KEY POINTS:
• Forty million sequences are a cost-efficient solution for microbiota profiling
• Fifty million sequences allow low-abundance species detection
• Twenty million sequences are recommended for ARG identification
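
The downsampling evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy read labels, the `min_reads` detection threshold, and the function names `downsample`, `detected`, and `relative_abundance` are all assumptions made for illustration. It draws random subsets of a "full-depth" set of classified reads and compares presence/absence calls and relative abundances at each depth.

```python
import random
from collections import Counter


def downsample(reads, n, seed=0):
    """Draw n reads without replacement, reproducibly for a given seed."""
    return random.Random(seed).sample(reads, n)


def detected(reads, min_reads=1):
    """Return the labels supported by at least min_reads reads.

    min_reads is a hypothetical detection threshold, not a value from the study.
    """
    counts = Counter(reads)
    return {label for label, count in counts.items() if count >= min_reads}


def relative_abundance(reads):
    """Return each label's share of the total read count."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Toy full-depth dataset: each read is represented only by the label
# (species or ARG) it was assigned to. The proportions are invented.
full = ["abundant_sp"] * 9000 + ["moderate_sp"] * 990 + ["rare_sp"] * 10

if __name__ == "__main__":
    reference = detected(full)
    for depth in (10000, 5000, 1000, 100):
        subset = downsample(full, depth)
        found = detected(subset)
        lost = sorted(reference - found)
        share = relative_abundance(subset).get("abundant_sp", 0.0)
        print(f"depth={depth:>6}  detected={len(found)}  lost={lost}  "
              f"abundant_sp={share:.3f}")
```

In a sketch like this, the share of the dominant label tends to stay close to its full-depth value as depth falls, while rare labels are the first to drop out of the detected set, which mirrors the qualitative versus quantitative contrast reported above.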