Keywords: NGS; WES; reference genome; variant annotation; variant calling

MeSH: Humans; Rare Diseases / diagnosis, genetics; Computational Biology; Genomics; Genome, Human; Germ Cells; High-Throughput Nucleotide Sequencing

Source: DOI:10.1093/bib/bbad508

Abstract:
Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.