Keywords: Long-reads; Pipeline evaluation; SV caller; Sequence aligner; Structural variation; Third-generation sequencing

MeSH: Humans; High-Throughput Nucleotide Sequencing / methods; Genomic Structural Variation; Software; Sequence Analysis, DNA / methods

Source: DOI:10.1186/s13059-024-03324-5

Abstract:
BACKGROUND: Structural variation (SV) detection methods using third-generation sequencing data are widely employed, yet accurately detecting SVs remains challenging. Different methods often yield inconsistent results for certain SV types, complicating tool selection and revealing biases in detection.
RESULTS: This study comprehensively evaluates 53 SV detection pipelines using simulated and real data from PacBio (CLR: Continuous Long Read; CCS: Circular Consensus Sequencing) and Nanopore (ONT) platforms. We assess their performance in detecting SVs of various sizes and types, their breakpoint biases, and their genotyping accuracy at various sequencing depths. Notably, pipelines such as Minimap2-cuteSV2, NGMLR-SVIM, PBMM2-pbsv, Winnowmap-Sniffles2, and Winnowmap-SVision exhibit comparatively higher recall and precision. Our findings also show that combining multiple pipelines that share the same aligner, such as pbmm2 or winnowmap, can significantly enhance performance. Detailed rankings and performance metrics for the individual pipelines can be viewed in a dynamic table: http://pmglab.top/SVPipelinesRanking
CONCLUSIONS: This study comprehensively characterizes the strengths and weaknesses of numerous pipelines, providing valuable insights that can improve SV detection in third-generation sequencing data and inform SV annotation and function prediction.
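The abstract compares pipelines by recall and precision. As a rough illustration of what those metrics mean for SV calls, the sketch below matches a call set against a truth set and derives precision, recall, and F1. It is not the study's evaluation procedure: the `SV` record, the `matches` and `evaluate` helpers, and the thresholds (`max_dist=500` bp, `min_size_ratio=0.7`) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Minimal SV record (hypothetical; real callers emit VCF entries)."""
    chrom: str
    pos: int
    svtype: str   # e.g. "DEL", "INS", "DUP", "INV"
    length: int


def matches(call, truth, max_dist=500, min_size_ratio=0.7):
    """Assumed matching rule: same chromosome and type, breakpoints
    within max_dist bp, and sizes similar by at least min_size_ratio."""
    if call.chrom != truth.chrom or call.svtype != truth.svtype:
        return False
    if abs(call.pos - truth.pos) > max_dist:
        return False
    small, large = sorted((abs(call.length), abs(truth.length)))
    return large == 0 or small / large >= min_size_ratio


def evaluate(calls, truths, **kwargs):
    """Greedy one-to-one matching; returns (precision, recall, f1)."""
    unmatched = list(truths)
    tp = 0
    for call in calls:
        hit = next((t for t in unmatched if matches(call, t, **kwargs)), None)
        if hit is not None:
            unmatched.remove(hit)   # each truth SV can be matched once
            tp += 1
    fp = len(calls) - tp            # calls with no truth counterpart
    fn = len(unmatched)             # truth SVs that were never called
    precision = tp / (tp + fp) if calls else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


truths = [SV("chr1", 1000, "DEL", 300), SV("chr1", 5000, "INS", 200)]
calls = [SV("chr1", 1020, "DEL", 290), SV("chr2", 100, "DUP", 1000)]
precision, recall, f1 = evaluate(calls, truths)
```

In this toy input, one call matches a truth SV, one call is spurious, and one truth SV is missed, so both precision and recall are one half. Greedy matching is used only to keep the sketch short; a real benchmark would need a more careful assignment when several calls fall near the same truth SV.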