Keywords: Oxford Nanopore; Pacific Biosciences; next-generation sequencing; single genome sequencing

Source: DOI:10.1089/AID.2024.0012

Abstract:
Our goal was to assess the accuracy of next-generation sequencing (NGS) compared with Sanger. We performed single genome amplification (SGA) of HIV-1 gp160 on extracted tissue DNA from two HIV+ individuals. Amplicons (n = 30) were sequenced with Sanger or reamplified with barcoded primers and pooled before sequencing using Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PB). For each amplicon, a consensus sequence for NGS reads was obtained by (1) mapping reads to the Sanger sequence when available ("reference-based") or (2) mapping reads to a "pseudo-reference" sequence, i.e., a consensus sequence of a subset of NGS reads ("reference-free"). PB reads were clustered based on genetic similarity. A Sanger consensus sequence was obtained for 23/30 amplicons, for which all NGS consensus sequences were identical (n = 9) or nearly identical (n = 14) compared with Sanger. For the nine mismatches between Sanger/NGS, the nucleotide in the NGS sequence matched all other sequences from that patient. Of the 7/30 amplicons without a Sanger sequence, NGS sequences had ≥35 ambiguous calls in five amplicons and 0 ambiguities in two amplicons. Analysis of the electropherograms showed failure of a single sequencing primer for the latter two amplicons (consistent with a single template) and overlapping peaks for the other five (consistent with multiple templates). Clustering results closely followed the Sanger/NGS consensus results, where amplicons derived from a single template also had a single cluster and vice versa (with one exception, which could be the result of barcode misidentification). Representative sequences from the clusters contained 2-13 differences compared with Sanger/NGS. In summary, we show that both ONT and PB can produce amplicon consensus sequences with similar or higher accuracy compared with Sanger and, importantly, without the need for a known reference sequence. Clustering could be useful in some circumstances to predict or confirm the presence of multiple starting templates.
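
The consensus-and-ambiguity idea described in the abstract can be illustrated with a minimal sketch. This is not the study's pipeline: it assumes the reads for one amplicon have already been mapped and padded to equal length, and the function names (`consensus`, `count_ambiguous`, `mismatches`) and the 0.7 majority threshold are hypothetical choices made only for illustration. It shows how a column without a clear majority base becomes an ambiguous call, and how the number of such calls might hint at more than one starting template.

```python
from collections import Counter


def consensus(aligned_reads, min_fraction=0.7):
    """Majority-rule consensus over equal-length, pre-aligned reads.

    A column whose most common base is seen in fewer than
    min_fraction of reads is called 'N' (ambiguous).
    The 0.7 threshold is an assumption, not a value from the study.
    """
    if not aligned_reads:
        raise ValueError("no reads supplied")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    calls = []
    for column in zip(*aligned_reads):
        base, count = Counter(column).most_common(1)[0]
        calls.append(base if count / len(column) >= min_fraction else "N")
    return "".join(calls)


def count_ambiguous(sequence):
    """Number of ambiguous ('N') calls in a consensus sequence."""
    return sequence.count("N")


def mismatches(seq_a, seq_b):
    """Differences between two equal-length sequences, skipping 'N' positions."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b and "N" not in (a, b))


# Toy reads: one sporadic read error versus an even mix of two templates.
single_template = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "ACGTACGT"]
mixed_templates = ["ACGTACGT", "ACGTACGT", "ACCTATGT", "ACCTATGT"]

single_consensus = consensus(single_template)
mixed_consensus = consensus(mixed_templates)
print(single_consensus, count_ambiguous(single_consensus))
print(mixed_consensus, count_ambiguous(mixed_consensus))
```

In this toy input the sporadic error is outvoted and the consensus has no ambiguous calls, while the evenly mixed reads yield ambiguous calls at every position where the two templates differ. A real workflow would also need read mapping, indel handling, and quality filtering, none of which are modeled here.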