Structural variants (SVs) are a major class of DNA mutation, typically defined as genomic alterations larger than 50 bp, and include insertions, deletions, duplications, inversions, and translocations. These alterations can profoundly affect phenotypic characteristics, contributing to disorders such as cancer and influencing response to treatment and infections. We evaluated four long-read aligners and five SV callers on three Oxford Nanopore NGS human genome datasets in terms of precision, recall, F1-score, depth of coverage, and speed of analysis. The best SV caller with respect to recall, precision, and F1-score, when paired with different aligners at different coverage levels, tends to vary with the dataset and the specific SV type being analyzed. Nevertheless, our findings show that Sniffles and CuteSV tend to perform well across aligners and coverage levels, followed by SVIM and PBSV, with SVDSS in last place. CuteSV has the highest average F1-score (82.51%) and recall (78.50%), and Sniffles has the highest average precision (94.33%). Minimap2 as the aligner combined with Sniffles as the SV caller forms a strong base for an SV-calling pipeline because of their high speed and reasonable performance. PBSV has a lower average F1-score, precision, and recall, and may therefore generate more false positives and miss some true SVs. These results provide insight into the performance of several popular long-read aligners and SV callers and can serve as a reference for researchers selecting the most suitable tools for SV detection.
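For reference, precision, recall, and F1-score follow the standard definitions based on true-positive (TP), false-positive (FP), and false-negative (FN) call counts relative to a truth set. The sketch below is a minimal Python illustration of those definitions only. The function name `sv_metrics` and the counts are hypothetical and are not taken from the study, and how a call is matched to a truth-set SV is assumed to be decided upstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVMetrics:
    precision: float
    recall: float
    f1: float


def sv_metrics(tp: int, fp: int, fn: int) -> SVMetrics:
    """Compute precision, recall, and F1 from SV call counts.

    tp: calls that match a truth-set SV
    fp: calls with no matching truth-set SV
    fn: truth-set SVs with no matching call
    """
    # Guard against empty denominators (e.g. a caller that reports nothing).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return SVMetrics(precision, recall, f1)


# Hypothetical counts, for illustration only (not results from the study).
example = sv_metrics(tp=90, fp=10, fn=30)
print(f"precision={example.precision:.2%} "
      f"recall={example.recall:.2%} "
      f"F1={example.f1:.2%}")
```

With these illustrative counts, the caller is precise (few false positives) but misses a quarter of the true SVs, so F1 falls between the two values. Note that an average F1-score taken across datasets is not in general equal to the F1 computed from average precision and average recall.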