Keywords: deep learning; false-positives; long-read sequencing; structural variation detection

MeSH: Algorithms; Humans; Sequence Analysis, DNA / methods; High-Throughput Nucleotide Sequencing / methods; Genomics / methods; Genomic Structural Variation; Software

Source: DOI:10.1093/bib/bbae336

Abstract:
Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been shown to characterize SVs better, SVs detected from noisy long-read data still include a considerable proportion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm to reduce the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample than several existing general-purpose SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.