Keywords: Error correction; Genome assembly; Human genomes; Long reads; Segmental duplication

MeSH: Humans; Genome, Human; DNA Copy Number Variations; High-Throughput Nucleotide Sequencing / methods; Software; Nanopore Sequencing / methods; Sequence Analysis, DNA / methods; Genomics / methods

Source: DOI:10.1186/s13059-024-03252-4

Abstract:
Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.