Keywords: Guppy; MinION; Oxford Nanopore Technology (ONT); antimicrobial resistance (AMR); antimicrobial resistance genes (ARG); de novo assembly; long-read sequencing (LRS); plasmid

Source: DOI:10.3389/fmicb.2022.796465

Abstract:
Long-read sequencing (LRS) can resolve repetitive regions, a limitation of short-read (SR) data. Reduced cost and instrument size have led to a steady increase in LRS across diagnostics and research. Here, we re-basecalled FAST5 data sequenced between 2018 and 2021 and analyzed the data in relation to gDNA across a large dataset (n = 200) spanning a wide range of GC content (25-67%). We examined whether re-basecalled data would improve the hybrid assembly and, for a smaller cohort, compared long-read (LR) assemblies in the context of antimicrobial resistance (AMR) genes and mobile genetic elements. We included a cost analysis when comparing SR and LR instruments. We compared the R9 and R10 chemistries and reported not only a larger yield but also increased read quality with R9 flow cells. There were often discrepancies in ARG presence/absence and/or variant detection in LR assemblies. Flye-based assemblies were generally efficient at detecting the presence of ARG on both the chromosome and plasmids. Raven ran more quickly but recovered small plasmids inconsistently, notably a ∼15-kb Col-like plasmid harboring blaKPC. Canu assemblies were the most fragmented, with genome sizes larger than expected. LR assemblies failed to consistently determine multiple copies of the same ARG as identified by the Unicycler reference. Even with improvements to ONT chemistry and basecalling, long-read assemblies can lead to misinterpretation of data. If LR data are currently being relied upon, it is necessary to perform multiple assemblies, although this is resource (computing) intensive and not yet readily available/usable.